Gemini 3.1 Pro
Benchmark scores: 78.8 SWE-bench Verified · 90.8 GPQA Diamond · 77.1 ARC-AGI-2 · 1493 LM Arena Elo
Available via: API, Chat, Batch, Vertex AI
Gemini 3.1 Pro is the strongest all-around model across multiple independent benchmarks, leading or placing near the top of every major evaluation listed below.
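For the API route, here is a minimal sketch using the google-genai Python SDK. The model ID string is an assumption for illustration; this page does not list the exact identifier.

```python
# Minimal sketch, assuming the google-genai SDK and a hypothetical model ID.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-3.1-pro",  # assumed ID; check the model catalog for the exact string
    contents="Summarize the SWE-bench Verified benchmark in two sentences.",
)
print(response.text)
```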
Benchmarks
| Benchmark | Score | Rank |
|---|---|---|
| SWE-bench Verified | 78.8% | Leading (among GA models) |
| GPQA Diamond | 90.8% | #4 |
| ARC-AGI-2 | 77.1% | #1 |
| LM Arena Elo | 1493 | #2 |
Pricing
| Token type | Price (USD per 1M tokens) |
|---|---|
| Input | $2.00 |
| Output | $12.00 |
Cheaper on input pricing than both Claude Opus and GPT-5.4.
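As a quick sanity check on what these rates mean per request, a small sketch that applies the listed input and output prices to a token count (the token counts in the example are illustrative):

```python
# Estimate per-request cost from the listed Gemini 3.1 Pro rates.
INPUT_USD_PER_M = 2.00    # USD per 1M input tokens (pricing table above)
OUTPUT_USD_PER_M = 12.00  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a single request."""
    return (input_tokens / 1_000_000) * INPUT_USD_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_USD_PER_M

# Illustrative example: a 200k-token document plus a 2k-token answer.
print(f"${request_cost(200_000, 2_000):.4f}")  # $0.4240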
Variants
| Model | Input/Output (per 1M) | Use case |
|---|---|---|
| Gemini 3.1 Pro | $2.00/$12.00 | Flagship reasoning |
| Gemini 3.1 Flash | $0.50/$3.00 | Fast, balanced |
| Gemini 3.1 Flash-Lite | $0.10/$0.40 | Ultra-cheap, high volume |
| Gemini 3.1 Ultra | Premium (pricing not yet public) | Native multimodal reasoning |
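The "Use case" column above suggests a simple routing rule; a minimal sketch of that idea follows. The model ID strings and the routing criteria are assumptions, not an official recommendation.

```python
# Hedged sketch: route requests to a 3.1 variant based on the use cases above.
# The ID strings are assumed; Ultra is omitted because its pricing is not yet public.
def pick_variant(needs_deep_reasoning: bool, high_volume: bool) -> str:
    if needs_deep_reasoning:
        return "gemini-3.1-pro"         # flagship reasoning
    if high_volume:
        return "gemini-3.1-flash-lite"  # ultra-cheap, high volume
    return "gemini-3.1-flash"           # fast, balanced default

print(pick_variant(needs_deep_reasoning=False, high_volume=True))  # gemini-3.1-flash-lite
```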
Strengths
- Best price among frontier models ($2/$12)
- 2M context window — largest of any frontier model
- Native multimodal (Ultra variant) — not bolted-on vision
- ARC-AGI-2 leader — strongest abstract reasoning
Weaknesses
- GPQA trails Claude Opus by 3.5 points
- Google’s API ecosystem (Vertex AI) adds complexity vs simpler APIs
- Ultra pricing not yet publicly available
Architecture
The model is natively multimodal from the ground up: it processes text, images, audio, and video in a single model rather than routing through separate encoders. The 2M-token context window makes it especially strong for large-document analysis.
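A minimal sketch of the large-document pattern, again using the google-genai Python SDK; the model ID and the file path are placeholders, not values taken from this page.

```python
# Hedged sketch: long-document analysis leaning on the large context window.
# The model ID and the file path are assumptions for illustration only.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

with open("annual_report.txt", "r", encoding="utf-8") as f:
    document = f.read()  # a very long document; the window is stated as 2M tokens

response = client.models.generate_content(
    model="gemini-3.1-pro",  # assumed ID
    contents=[document, "List every risk factor mentioned, one line each."],
)
print(response.text)
```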