DeepSeek V4

WorldofAI · 25 Claims

Model Release
Neutral
DeepSeek released two new models as part of V4: Pro and Flash.
The transcript states the team is releasing two new models, introducing DeepSeek version 4 preview.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Technical Specifications
Neutral
Both models feature a cost-effective 1 million context length.
The author notes the models are built around a 1 million context length.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Neutral
DeepSeek V4 Pro has 1.6 trillion total parameters and 49 billion active parameters.
The transcript provides exact parameter counts for the Pro model.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Neutral
DeepSeek V4 Flash is a faster, cheaper model with 284 billion total parameters and 13 billion active parameters, offering near-Pro reasoning on simpler agent tasks.
Transcript provides Flash specifications and targeted use case.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
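To put the parameter counts above in perspective, the short sketch below computes what share of each model's parameters is active. It uses only the figures quoted in the claims; the interpretation of "active" as parameters used per token (as in mixture-of-experts models) is an assumption, since the source does not describe the architecture, and the names `MODELS` and `active_fraction` are illustrative.

```python
# Parameter counts as quoted in the claims above: (total, active).
# Assumption: "active" means parameters used per token; the source
# does not state the architecture.
MODELS = {
    "DeepSeek V4 Pro": (1_600_000_000_000, 49_000_000_000),
    "DeepSeek V4 Flash": (284_000_000_000, 13_000_000_000),
}


def active_fraction(total: int, active: int) -> float:
    """Return the share of total parameters that is active."""
    return active / total


for name, (total, active) in MODELS.items():
    share = active_fraction(total, active)
    print(f"{name}: {share:.2%} of parameters active")
```

By these figures, Pro activates roughly 3% of its parameters and Flash roughly 4.6%, so both models use only a small fraction of their total size at a time.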
Performance Claims
Neutral
DeepSeek claims V4 Pro is the top open-source model in reasoning, STEM, coding, forensic workflows, and world knowledge, rivaling leading closed-source models.
The author reports DeepSeek's own claims about V4 Pro's performance.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Neutral
DeepSeek claims V4 is on the level of Opus 4.5 on real-world agentic coding tasks and in certain cases beats Opus 4.6.
Author reports DeepSeek's performance comparison claims.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Benchmark Skepticism
Disagree
The author believes DeepSeek V4 is benchmark-maxed, meaning the team optimized heavily for benchmarks in their favor.
Author explicitly states 'I personally see that as benchmark maxed' and that the model is optimized for benchmarks.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Disagree
In benchmarks, DeepSeek claims V4 beats Claude Opus 4.6 Max, Gemini 3.1 Pro, and GPT 5.4 High in certain areas, but the author does not agree.
Author reports the benchmark claims but expresses strong personal disagreement.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Performance Comparison
Disagree
In the author's testing, Kimi K2.6, Qwen, and MiniMax M2.7 outperform DeepSeek V4 preview in extended thinking and real-world use cases.
Author reports personal testing results showing other models outperform DeepSeek V4.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Future Expectations
Neutral
The current V4 is a preview release and an official version could fix current drawbacks.
Author notes it's a preview and hopes official release will address issues.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Open Source and Pricing
Agree
The model has strong long context performance, extremely low cost, and is open-source under the MIT license.
Author highlights these as positive features: strong context, low cost, MIT license.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Real-world Performance
Disagree
From the author's testing, the model feels subpar, sloppy, and lazy in execution.
Author says 'model feels subpar in so many ways... super sloppy, lazy in execution'.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Real-world Testing
Disagree
In a browser-based macOS clone test, DeepSeek V4 generated a lackluster, basic output without SVG icons or proper UI thinking, failing to capture macOS structure.
Author describes the generation as super basic, no creativity, no SVG icons for apps.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Disagree
DeepSeek V4 Flash sometimes performs better than Pro in certain prompting cases but still delivers lackluster results.
Author says flash mode can get the job done in a better fashion but is still lackluster.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Disagree
DeepSeek V4 Pro failed to finish generating an off-road EV durability test, while MiniMax M2.7 completed it with interactive car movement.
Author describes incomplete generation and contrasts with competitor's completion.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Disagree
In a SaaS landing page generation, DeepSeek V4 Pro's quality resembled GPT-3.5 output.
Author says 'it looks like the GPT-3.5 in terms of its generation quality'.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Comparison with Competitors
Disagree
In a Slack clone generation, DeepSeek V4's output mimicked some structure but did not look like the actual app, while GLM 5.1 captured the Slack tone and style accurately.
Author compares the two models' front-end clones, favoring GLM 5.1.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Disagree
In SVG generation, DeepSeek V4 Pro produced a decent butterfly, but Qwen 3.6 Plus outperformed it in almost every respect.
Author shows SVG comparison and says 'you can clearly tell that Qwen 3.6 Plus is better'.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Disagree
For a 3D PS5 controller task, DeepSeek V4 Pro generated an object resembling a table, while Qwen produced a recognizable controller.
Author shows comparison and says 'this doesn't even look like a controller. It looks like a freaking table.'
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Disagree
In an Instagram feed clone, MiniMax M2.7 replicated the feed correctly, while DeepSeek V4 Pro was buggy and failed to compile.
Author says DeepSeek cannot even complete it and that the output is messy and buggy.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Benchmark Performance
Disagree
On Code Arena, DeepSeek V4 ranks number three behind GLM 5.1 and Kimi K2.6, not number one.
Author cites the Code Arena leaderboard as evidence that it does not dominate coding tasks.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Pricing
Neutral
DeepSeek V4 Pro pricing is 14 dollars per 1 million input tokens and 348 dollars per 1 million output tokens.
Author provides exact pricing details for Pro model.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Neutral
DeepSeek V4 Flash pricing is 3 cents per 1 million input tokens and 28 cents per 1 million output tokens.
Author provides exact pricing details for Flash model.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
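As a rough illustration of what per-million-token pricing means for a single workload, the sketch below applies the Flash prices quoted above (3 cents input, 28 cents output per 1 million tokens). The workload size of 200,000 input tokens and 50,000 output tokens is a made-up example, and `request_cost_cents` is a hypothetical helper, not part of any DeepSeek API.

```python
# Flash prices as quoted in the claim above, in US cents per 1 million tokens.
FLASH_INPUT_CENTS_PER_M = 3
FLASH_OUTPUT_CENTS_PER_M = 28

TOKENS_PER_MILLION = 1_000_000


def request_cost_cents(
    input_tokens: int,
    output_tokens: int,
    input_cents_per_m: float,
    output_cents_per_m: float,
) -> float:
    """Cost in cents for a workload billed per 1 million tokens."""
    total = input_tokens * input_cents_per_m + output_tokens * output_cents_per_m
    return total / TOKENS_PER_MILLION


# Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
cost = request_cost_cents(
    200_000, 50_000, FLASH_INPUT_CENTS_PER_M, FLASH_OUTPUT_CENTS_PER_M
)
print(f"Flash cost for the example workload: {cost:.2f} cents")
```

Under these assumptions the example workload costs 2 cents on Flash. The same function can be applied to any other price pair by passing different per-million rates.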
Overall Assessment
Disagree
The author states DeepSeek V4 is nowhere near other Chinese models like MiniMax or Kimi K2.6.
Explicit statement: 'Deep Seek 4 is nowhere near even any of the Chinese models that have been released'.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)
Disagree
Just because the model is cost-efficient does not automatically make it great; cheaper does not mean better.
Author's concluding statement that cost-efficiency does not equate to quality.
Source: Deepseek v4: Best Opensource Model Ever? (Fully Tested)