Claude Sonnet 4.6
Available via: API, Chat, Batch, Agent SDK, Managed Agents
Claude Sonnet 4.6 is the best value proposition in the Claude lineup: 79.6% on SWE-bench Verified (just 1.2 points behind Opus) at 40% lower cost.
Benchmarks
| Benchmark | Score | vs Opus 4.6 |
|---|---|---|
| SWE-bench Verified | 79.6% | -1.2 pts |
| GPQA Diamond | 88.5% | -5.8 pts |
| LM Arena Elo | 1478 | -26 pts |
The gap between Sonnet and Opus is the narrowest it has ever been. For most tasks, the quality difference is imperceptible.
Pricing
| Per 1M tokens | Price | vs Opus |
|---|---|---|
| Input | $3.00 | 40% cheaper |
| Output | $15.00 | 40% cheaper |
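To make the cost gap concrete, here is a minimal sketch that estimates per-request cost from the rates above. The Opus rates are back-calculated from the "40% cheaper" figure (Sonnet = 60% of Opus), not quoted from an official price list:

```python
# Rates quoted above, in USD per 1M tokens.
SONNET_INPUT, SONNET_OUTPUT = 3.00, 15.00
# Inferred Opus rates: "40% cheaper" means Sonnet costs 60% of Opus.
OPUS_INPUT, OPUS_OUTPUT = SONNET_INPUT / 0.6, SONNET_OUTPUT / 0.6

def request_cost(input_tokens: int, output_tokens: int,
                 in_rate: float, out_rate: float) -> float:
    """Cost in USD for one request at per-1M-token rates."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example workload: 100K-token prompt, 4K-token response.
sonnet = request_cost(100_000, 4_000, SONNET_INPUT, SONNET_OUTPUT)
opus = request_cost(100_000, 4_000, OPUS_INPUT, OPUS_OUTPUT)
print(f"Sonnet: ${sonnet:.3f}  Opus: ${opus:.3f}")
```

At long-context volumes the input side dominates, so the 40% rate gap translates almost directly into a 40% lower bill.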
Architecture & capabilities
- Context: 1M tokens — same as Opus, at standard pricing
- Output: Up to 64K tokens
- Thinking modes: Adaptive, extended, and interleaved thinking — same as Opus
- Shares all core Opus improvements
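To make the thinking-mode support concrete, here is a minimal sketch of a Messages API request body with extended thinking enabled. The model ID string and the token budgets are illustrative assumptions, not confirmed identifiers; check the official model list before use:

```python
import json

# Hypothetical model ID for illustration; verify against the official model list.
MODEL_ID = "claude-sonnet-4-6"

def build_request(prompt: str, thinking_budget: int = 10_000) -> dict:
    """Build a Messages API request body with extended thinking enabled.

    Shape follows the public Messages API: `thinking` takes a type and a
    token budget, and `max_tokens` must exceed that budget.
    """
    return {
        "model": MODEL_ID,
        "max_tokens": 64_000,  # up to 64K output tokens, per the spec above
        "thinking": {
            "type": "enabled",
            "budget_tokens": thinking_budget,
        },
        "messages": [{"role": "user", "content": prompt}],
    }

body = build_request("Summarize the trade-offs between Sonnet and Opus.")
print(json.dumps(body, indent=2))
```

Because Sonnet shares the thinking interface with Opus, the same request body works for either model; only the `model` string changes.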
Strengths
- Best price/performance ratio in the frontier tier
- 1M context at $3/$15 — cheaper than most competitors
- Same thinking capabilities as Opus
- Strong enough for daily coding, analysis, and agentic work
Weaknesses
- Slightly weaker on graduate-level science reasoning (GPQA gap)
- Lower LM Arena ranking — noticeable in head-to-head comparisons
When to use
Default choice for most tasks. Use Opus only when you need the absolute best reasoning quality or are working on PhD-level scientific analysis.