insidejob
Closed source, Google

Gemini 3.1 Pro

Context: 2M tokens
Max output: 64K tokens
Architecture: Dense transformer, native multimodal
Pricing (per 1M tokens): $2 in / $12 out

Benchmark scores

SWE-bench Verified: 78.8%
GPQA Diamond: 90.8%
ARC-AGI-2: 77.1%
LM Arena Elo: 1493
Available via: API, Chat, Batch, Vertex AI

Gemini 3.1 Pro is the strongest all-around model on the benchmarks listed here: it leads SWE-bench Verified (among GA models) and ARC-AGI-2, places #2 on LM Arena Elo, and ranks #4 on GPQA Diamond.

Benchmarks

Benchmark            Score    Rank
SWE-bench Verified   78.8%    Leading (among GA models)
GPQA Diamond         90.8%    #4
ARC-AGI-2            77.1%    #1
LM Arena Elo         1493     #2

Pricing

Per 1M tokens:
Input: $2.00
Output: $12.00

Cheaper than both Claude Opus and GPT-5.4 at the input tier.

Variants

Model                    Input/Output (per 1M)   Use case
Gemini 3.1 Pro           $2.00/$12.00            Flagship reasoning
Gemini 3.1 Flash         $0.50/$3.00             Fast, balanced
Gemini 3.1 Flash-Lite    $0.10/$0.40             Ultra-cheap, high volume
Gemini 3.1 Ultra         Premium                 Native multimodal reasoning
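
To make the per-1M pricing concrete, here is a minimal sketch that estimates the cost of one workload on each variant with a listed price. The prices are copied from the Variants table; the function name `estimate_cost` and the example token counts are illustrative assumptions, and Ultra is left out because no price is listed for it.

```python
from decimal import Decimal

# USD per 1M tokens (input, output), copied from the Variants table.
# Gemini 3.1 Ultra is omitted: the table lists no numeric price for it.
PRICES_PER_MILLION = {
    "Gemini 3.1 Pro": (Decimal("2.00"), Decimal("12.00")),
    "Gemini 3.1 Flash": (Decimal("0.50"), Decimal("3.00")),
    "Gemini 3.1 Flash-Lite": (Decimal("0.10"), Decimal("0.40")),
}

ONE_MILLION = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of a request (hypothetical helper)."""
    input_price, output_price = PRICES_PER_MILLION[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / ONE_MILLION


if __name__ == "__main__":
    # Example workload (assumed numbers): 1M input tokens, 100K output tokens.
    for name in PRICES_PER_MILLION:
        print(f"{name}: ${estimate_cost(name, 1_000_000, 100_000):.2f}")
```

For that example workload the arithmetic for Pro is $2.00 for input plus $1.20 for output, or $3.20 in total. Output tokens cost six times as much as input tokens on Pro and Flash, so they can dominate the bill on generation-heavy workloads.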

Strengths

  • Best price among frontier models ($2/$12)
  • 2M context window — largest of any frontier model
  • Native multimodal (Ultra variant) — not bolted-on vision
  • ARC-AGI-2 leader — strongest abstract reasoning

Weaknesses

  • GPQA Diamond score trails Claude Opus by 3.5 points
  • Google’s API ecosystem (Vertex AI) adds complexity vs simpler APIs
  • Ultra pricing not yet publicly available

Architecture

Native multimodal from the ground up — processes text, images, audio, and video in a single model rather than routing through separate encoders. The 2M context window makes it especially strong for large-document analysis.
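
For large-document work, it helps to check a request against the listed limits before sending it. The sketch below is a rough budget check, not an official client. It assumes "2M" and "64K" mean 2,000,000 and 64,000 tokens and that output tokens count against the same context window; neither assumption is confirmed by the spec list. The function names are hypothetical.

```python
# Limits from the spec list. The exact token counts behind "2M" and "64K"
# are assumed to be decimal round numbers; the real limits may differ.
CONTEXT_WINDOW_TOKENS = 2_000_000
MAX_OUTPUT_TOKENS = 64_000


def max_prompt_tokens(requested_output_tokens: int) -> int:
    """Largest prompt that still leaves room for the requested output.

    Assumes prompt and output share one context window (an assumption).
    """
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("requested output exceeds the max output limit")
    return CONTEXT_WINDOW_TOKENS - requested_output_tokens


def fits_in_context(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if the request stays within both assumed limits."""
    if requested_output_tokens > MAX_OUTPUT_TOKENS:
        return False
    return prompt_tokens <= max_prompt_tokens(requested_output_tokens)
```

Under these assumptions, a request that reserves the full 64K output budget can carry a prompt of at most 1,936,000 tokens. Token counts should come from the provider's tokenizer, not from character counts.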