GPT-3
A language model that became dramatically more capable than its predecessors simply by being roughly 100 times larger and trained on roughly 100 times more text. It demonstrated that you could often get a model to do a new task by just describing the task in plain English and showing a few examples — no retraining required.
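The "describe the task and show a few examples" idea can be sketched as plain string assembly. This is a minimal illustration, not the format OpenAI used: the `build_few_shot_prompt` helper, the `Input:`/`Output:` labels, and the translation pairs are all assumptions chosen for the example.

```python
def build_few_shot_prompt(task_description, examples, query):
    """Assemble a plain-text few-shot prompt.

    Hypothetical helper: the layout (instruction, worked examples, open slot)
    is an illustrative assumption, not a format prescribed by the source.
    """
    lines = [task_description, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # End on an open "Output:" so a text-completion model continues with the answer.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("sea otter", "loutre de mer"), ("cheese", "fromage")],
    "peppermint",
)
print(prompt)
```

The model's weights never change here. The examples live only in the prompt text, which is why no retraining is needed to switch tasks.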
How are Intelligence, Speed & Cost bucketed?
Intelligence
- Top 1%: ≤ 1%
- Top 5%: ≤ 5%
- Top 10%: ≤ 10%
- Good: ≤ 25%
- Medium: ≤ 50%
- Below avg: > 50%
Speed
- Top 1%: ≥ 345 tok/s
- Top 5%: ≥ 237 tok/s
- Top 10%: ≥ 196 tok/s
- Good: ≥ 146 tok/s
- Medium: ≥ 90 tok/s
- Slow: < 90 tok/s
Cost
- Free: open weights · self-host
- Low: < $1 / M out
- Moderate: $1–5 / M out
- High: ≥ $5 / M out
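The thresholds listed above can be read as simple lookup rules. The sketch below is one possible encoding, with several assumptions: the function names are hypothetical, the first group of thresholds is taken to be a percentile rank for intelligence, and a price of exactly $5 per million output tokens is treated as High because the legend writes that tier as "≥ $5".

```python
# Thresholds copied from the legend; everything else is an illustrative assumption.
INTELLIGENCE_BUCKETS = [(1, "Top 1%"), (5, "Top 5%"), (10, "Top 10%"), (25, "Good"), (50, "Medium")]
SPEED_BUCKETS = [(345, "Top 1%"), (237, "Top 5%"), (196, "Top 10%"), (146, "Good"), (90, "Medium")]


def intelligence_bucket(rank_percentile):
    """Map a percentile rank (lower is better) to its label."""
    for ceiling, label in INTELLIGENCE_BUCKETS:
        if rank_percentile <= ceiling:
            return label
    return "Below avg"


def speed_bucket(tokens_per_second):
    """Map output speed in tok/s (higher is better) to its label."""
    for floor, label in SPEED_BUCKETS:
        if tokens_per_second >= floor:
            return label
    return "Slow"


def cost_bucket(usd_per_million_output_tokens, open_weights=False):
    """Map output price in USD per million tokens to its label."""
    if open_weights:
        return "Free"  # self-hostable open weights
    if usd_per_million_output_tokens < 1:
        return "Low"
    if usd_per_million_output_tokens < 5:
        return "Moderate"
    return "High"


print(speed_bucket(150), cost_bucket(2.5), intelligence_bucket(30))
```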
Why it matters
GPT-3 is the moment "large language model" became a coherent product category. Every API-priced LLM business (OpenAI, Anthropic, Cohere, Mistral, and the API arms of Google and Meta) uses the pricing model (per-token), interface model (text-in, text-out), and capability framing (few-shot prompting) that GPT-3 established. Without GPT-3 and its paper, "Language Models are Few-Shot Learners" (Brown et al., 2020), the 2022–2026 AI investment cycle would not have happened on the same timeline.
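Per-token pricing reduces to a small piece of arithmetic. The sketch below assumes separate input and output rates quoted per million tokens; the `request_cost_usd` name and the prices in the usage line are hypothetical and do not reflect any vendor's actual rates.

```python
def request_cost_usd(input_tokens, output_tokens,
                     usd_per_million_input, usd_per_million_output):
    """Estimate the cost of one text-in, text-out request under per-token pricing."""
    total = (input_tokens * usd_per_million_input
             + output_tokens * usd_per_million_output)
    return total / 1_000_000


# Hypothetical rates: $2 per million input tokens, $6 per million output tokens.
cost = request_cost_usd(1_200, 300, usd_per_million_input=2.0, usd_per_million_output=6.0)
print(f"${cost:.4f}")
```

Because few-shot examples are sent with every request, they count toward the input tokens each time, so longer prompts raise the per-request cost directly.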
What it feels like
- 175B parameters, roughly 100x larger than GPT-2 (1.5B); the moment scale-as-progress became conventional wisdom
- Few-shot in-context learning emerged as a property of scale, a pattern that reframed the whole field
- Released through the paid OpenAI API in mid-2020, making it the first commercial frontier-tier LLM
- Inspired the Codex spinoff, then GitHub Copilot, then the entire AI-coding-tools wave
- Foundation for InstructGPT (RLHF) and ChatGPT — the productisation that broke out of research
- By 2025 measurements, GPT-3 trails on every benchmark — historical interest only
Best use cases
- Reading the seminal scaling paper to understand modern LLM emergence
- Few-shot prompting research — GPT-3 popularised the technique
- Citation in any work on scaling laws, in-context learning, or LLM history
Not ideal for
- Production deployment in 2025 — every successor is cheaper and better
- Reasoning, coding, or tool-use workloads: modern models such as Claude Sonnet, Claude Haiku, and GPT-4o dominate