GPT-4
OpenAI's successor to GPT-3.5, launched in March 2023 to ChatGPT Plus subscribers and, through a waitlist, to API users. It showed a step-change on reasoning benchmarks (bar exam, medical licensing, GRE, Olympiad math) and added image input (text output only). OpenAI declined to disclose model size or architecture, citing competitive and safety reasons.
How are Intelligence, Speed & Cost bucketed?
Intelligence
- Top 1%: ≤ 1%
- Top 5%: ≤ 5%
- Top 10%: ≤ 10%
- Good: ≤ 25%
- Medium: ≤ 50%
- Below avg: > 50%
Speed
- Top 1%: ≥ 345 tok/s
- Top 5%: ≥ 237 tok/s
- Top 10%: ≥ 196 tok/s
- Good: ≥ 146 tok/s
- Medium: ≥ 90 tok/s
- Slow: < 90 tok/s
Cost
- Free: open weights · self-host
- Low: < $1 / M out
- Moderate: $1–5 / M out
- High: ≥ $5 / M out
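The thresholds listed above can be read as simple lookup rules. The following is a minimal sketch, not the site's actual implementation. It assumes the first list applies to intelligence (as a rank percentile, where 1.0 means top 1%), the second to speed in tokens per second, and the third to cost in USD per million output tokens. The function names, and the use of `None` to mean "open weights, self-hosted", are hypothetical choices made for illustration.

```python
from typing import Optional

# Thresholds copied from the legend; labels are checked from best to worst.
INTELLIGENCE_LIMITS = [(1, "Top 1%"), (5, "Top 5%"), (10, "Top 10%"),
                       (25, "Good"), (50, "Medium")]
SPEED_FLOORS = [(345, "Top 1%"), (237, "Top 5%"), (196, "Top 10%"),
                (146, "Good"), (90, "Medium")]


def intelligence_bucket(rank_percentile: float) -> str:
    """Assumed input: rank percentile among models (1.0 = top 1%)."""
    for limit, label in INTELLIGENCE_LIMITS:
        if rank_percentile <= limit:
            return label
    return "Below avg"


def speed_bucket(tokens_per_second: float) -> str:
    """Map output speed in tok/s to the legend's speed label."""
    for floor, label in SPEED_FLOORS:
        if tokens_per_second >= floor:
            return label
    return "Slow"


def cost_bucket(usd_per_million_output: Optional[float]) -> str:
    """Assumption: None stands for open weights that can be self-hosted."""
    if usd_per_million_output is None:
        return "Free"
    if usd_per_million_output < 1:
        return "Low"
    if usd_per_million_output < 5:
        return "Moderate"
    return "High"


if __name__ == "__main__":
    print(intelligence_bucket(8))   # Top 10%
    print(speed_bucket(120))        # Medium
    print(cost_bucket(5))           # High
```

Note that the boundaries follow the legend literally: a price of exactly $5 per million output tokens falls in "High", and exactly 90 tok/s counts as "Medium" rather than "Slow".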
Why it matters
GPT-4 is what people actually mean when they say "AI" in a business conversation in 2024–25. The reasoning, bar-exam, code-generation, and image-understanding demos from its launch are the reference point against which every later model is implicitly compared in casual discourse.
Capability / Performance
Where this model sits relative to the middle 60% of models in the tree. All scores are 0–10 (higher is better).
What it feels like
- Step-change in reasoning vs GPT-3.5 — top 10% on simulated bar exam vs 3.5's bottom 10%
- MMLU 86.4% in English; on translated MMLU, beat GPT-3.5's English-language score in 24 of the 26 languages tested
- First widely deployed model with image input (text output only); the multimodal era starts here
- Scored 19 percentage points higher than GPT-3.5 on OpenAI's internal adversarial factuality evaluations
- More creative and reliable than GPT-3.5, and able to follow nuanced instructions that 3.5 mishandled
- OpenAI declined to publish parameter count or training details — closed-source standard set here
Best use cases
- Professional knowledge work that needed the strongest reasoning available at launch
- Code generation with chain-of-thought prompting
- Multilingual tasks across 26+ languages
- Image-input multimodal workflows once GPT-4V landed
Not ideal for
- Frontier work after Claude 3.5+ / GPT-4o / Llama 3 — quickly surpassed in 2024
- Cost-sensitive bulk inference: priced at a premium, and GPT-4o later launched at half the price of GPT-4 Turbo
- Self-hosted deployments — closed weights
Model Evolution
GPT is OpenAI's language model family.