# DeepSeek V4
Available via: DeepSeek API, OpenRouter, or self-hosted.
DeepSeek V4 is the largest freely available AI model — 1 trillion parameters at a price that undercuts every competitor by an order of magnitude.
## Benchmarks
| Benchmark | Score | Notes |
|---|---|---|
| SWE-bench Verified | 72.5% | Competitive with models 10x the price |
| GPQA Diamond | 84.0% | Approaching frontier scores |
| LM Arena Elo | 1445 | Top-tier for open models |
## Pricing
| Per 1M tokens | Price | vs Claude Opus |
|---|---|---|
| Input | $0.28 | 18x cheaper |
| Output | $1.10 | 23x cheaper |
The price/performance ratio is staggering. For workloads that don’t need the absolute best quality, DeepSeek V4 delivers 85-90% of frontier performance at 5% of the cost.
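The arithmetic behind that claim is easy to sketch. A minimal cost estimator using the list prices above ($0.28 input / $1.10 output per 1M tokens); purely illustrative, since actual billing may add cache discounts or tiering:

```python
# Illustrative cost estimator based on the list prices in the table above.
IN_PRICE = 0.28 / 1_000_000   # USD per input token
OUT_PRICE = 1.10 / 1_000_000  # USD per output token

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated API cost in USD for a single request."""
    return input_tokens * IN_PRICE + output_tokens * OUT_PRICE

# Example: a 50K-token prompt producing a 2K-token answer
print(round(cost_usd(50_000, 2_000), 4))  # → 0.0162
```

At these rates, even a long-prompt request costs fractions of a cent, which is where the order-of-magnitude savings over frontier pricing comes from.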
## Architecture
Builds on the V3 and R1 series, focusing on:
- Frontier reasoning quality via massive MoE architecture
- Improved long-context efficiency
- Enhanced tool-use and function calling
- Strong agentic workload performance
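For tool use, DeepSeek's API follows the OpenAI-compatible function-calling schema. The sketch below only builds a request payload (no network call); the model id `deepseek-chat` and the `get_weather` tool are assumptions for illustration, so check the current DeepSeek docs for V4's actual id:

```python
# Hedged sketch of an OpenAI-style function-calling request payload.
# "deepseek-chat" and "get_weather" are illustrative assumptions.
import json

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool for this example
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

payload = {
    "model": "deepseek-chat",  # assumed id; verify against DeepSeek docs
    "messages": [{"role": "user", "content": "Weather in Hangzhou?"}],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide whether to call the tool
}
print(json.dumps(payload)[:40])
```

The same payload can be sent with any OpenAI-compatible client pointed at the DeepSeek endpoint.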
## Variants
| Model | Params | Cost (input/M) | Use case |
|---|---|---|---|
| DeepSeek V4 | 1T | $0.28 | Full quality |
| DeepSeek V4 Lite | ~200B | ~$0.10 | Budget workloads; strong quality on limited compute |
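One way to use the two variants together is per-request routing: send quality-sensitive requests to V4 and everything else to Lite. The model names and routing rule below are illustrative assumptions, not an official API:

```python
# Illustrative router between the two variants from the table above.
# Model names and prices are assumptions for the sketch.
PRICING = {
    "deepseek-v4": 0.28,       # USD per 1M input tokens
    "deepseek-v4-lite": 0.10,  # approximate, per the table
}

def pick_model(needs_full_quality: bool) -> str:
    """Route a request to the full model only when quality demands it."""
    return "deepseek-v4" if needs_full_quality else "deepseek-v4-lite"

print(pick_model(False))  # → deepseek-v4-lite
```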
## Strengths
- Price/performance king — frontier-adjacent quality at commodity pricing
- Open weights — self-host for zero marginal cost
- 1T parameters deliver genuine reasoning depth
- V4 Lite variant for ultra-budget deployments
## Weaknesses
- 128K context window — much smaller than Claude (1M) or Llama (10M)
- Self-hosting requires massive GPU cluster
- Chinese company — some enterprises have compliance concerns
- Trails frontier models by 8-10 points on top benchmarks
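The 128K context window means long-document workloads need a budget check before sending. A rough pre-flight sketch is below; the 4-characters-per-token heuristic is a crude English-text approximation, not DeepSeek's actual tokenizer:

```python
# Rough context-budget check for the 128K window mentioned above.
# len(text) // 4 is a crude token estimate, not a real tokenizer.
CONTEXT_WINDOW = 128_000  # tokens

def fits_in_context(text: str, reserve_for_output: int = 4_000) -> bool:
    """Estimate whether a prompt plus reserved output fits in the window."""
    est_tokens = len(text) // 4  # heuristic: ~4 chars per token
    return est_tokens + reserve_for_output <= CONTEXT_WINDOW

print(fits_in_context("hello " * 10_000))  # ~15K estimated tokens → True
```

For inputs near the limit, use the provider's tokenizer for an exact count rather than this heuristic.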