LANGUAGE MODEL OpenAI

InstructGPT

Training LMs to Follow Instructions with Human Feedback

The training recipe — supervised fine-tuning followed by reinforcement learning from human feedback — that turned GPT-3 from a raw text completer into a model that actually follows instructions. Human raters preferred the 1.3B InstructGPT model over the 175B raw GPT-3 model, suggesting that alignment can matter more than scale for user-facing tasks.

Context
2K
Up to 2,048 tokens
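
The 2,048-token window has to hold both the prompt and the generated completion, so a caller would normally check the remaining budget before sending a request. A minimal sketch, assuming token counts are already known; the function name and its parameters are hypothetical and not part of any OpenAI API:

```python
CONTEXT_WINDOW = 2048  # tokens, per the figure above


def max_completion_tokens(prompt_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many tokens remain for the completion after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # A prompt longer than the window leaves no room at all.
    return max(context_window - prompt_tokens, 0)


print(max_completion_tokens(1500))  # 548
print(max_completion_tokens(3000))  # 0
```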

Why it matters

Instruction-tuned models in the InstructGPT mold, not raw GPT-3, are what people interact with when they use a "language model" today. The recipe in this paper is why the chat box on chat.openai.com is helpful instead of a chaotic autocomplete.

Core Capabilities

Long Documents
Handles entire codebases, books, and multi-doc RAG.
Research
Foundational paper or scientific contribution.

Context Window

2k tokens (short prompt)
Scale markers: 4k chat, 32k long docs, 128k books, 400k multi-doc, 1M codebase, 10M

Availability

API: Available
Product / App: Not available
Open Source: Not released
Enterprise: Contact sales

Pricing Model

Pay per token
Input and output billed separately.
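
With input and output billed separately, the cost of a request is the sum of two token counts, each multiplied by its own rate. A minimal sketch of that arithmetic; the function name is hypothetical and the prices are placeholders chosen for illustration, not actual rates:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Cost of one request when input and output tokens are billed at separate rates."""
    input_cost = (input_tokens / 1000) * input_price_per_1k
    output_cost = (output_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost


# Placeholder prices per 1,000 tokens (assumed values, for illustration only).
cost = request_cost(input_tokens=1500, output_tokens=500,
                    input_price_per_1k=0.02, output_price_per_1k=0.04)
print(f"{cost:.4f}")  # 0.0500
```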

Capability / Performance

Where this model sits relative to the middle 60% of models in the tree. All scores are 0–10 (higher is better).

Context / memory (context window size, log-scaled)
Lower 20%: 6.0
Upper 80%: 9.0
This model: 0.0

Lower 20% is the 20th percentile: 20% of models score below it. Upper 80% is the 80th percentile: only 20% of models score above it. This model marks where the current model lands. Percentile boundaries are computed across every model in the tree that reports the underlying benchmark for each capability.

What it feels like

  • Scaled up RLHF (reinforcement learning from human feedback), previously used on narrower tasks such as summarization, to general instruction following; the recipe behind ChatGPT, Claude, Gemini
  • 1.3B InstructGPT preferred over 175B GPT-3 on user prompts — alignment beat scale
  • Three-step training: SFT on demos, reward model from comparisons, PPO against reward — still the standard template
  • Direct precursor of ChatGPT (Nov 2022): same recipe with more conversational tuning
  • TruthfulQA improvement of +10 to +25 points over base GPT-3 — measurable hallucination reduction
  • Among the most cited alignment papers of the GPT era; a reference point for follow-ups such as Constitutional AI and DPO
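
The second and third training steps listed above can be illustrated with two small formulas. The sketch below is a simplification under stated assumptions, not the paper's implementation: the reward model is trained with a pairwise loss on human comparisons, and the RL step maximizes the reward-model score minus a penalty for drifting away from the SFT model. Here the penalty is applied once per sample, the PPO update itself is omitted, and the function names and the `beta` value are hypothetical.

```python
import math


def pairwise_rm_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Pairwise loss: -log(sigmoid(r_preferred - r_rejected)).

    Small when the reward model scores the human-preferred response higher.
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(m)) == log(1 + exp(-m)); the two branches avoid overflow for large |m|.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def kl_penalized_reward(rm_score: float, logprob_policy: float,
                        logprob_sft: float, beta: float) -> float:
    """Reward-model score minus a penalty for moving away from the SFT model."""
    return rm_score - beta * (logprob_policy - logprob_sft)


# Ranking the preferred response higher gives a lower loss than ranking it lower.
print(pairwise_rm_loss(2.0, 0.0) < pairwise_rm_loss(0.0, 2.0))  # True
# Equal scores give log(2): the reward model cannot tell the responses apart.
print(round(pairwise_rm_loss(1.0, 1.0), 4))  # 0.6931
# A sample the policy finds more likely than the SFT model does is penalized.
print(round(kl_penalized_reward(1.0, -10.0, -12.0, beta=0.1), 4))  # 0.8
```

In a real pipeline the rewards and log-probabilities would come from neural networks and the loss would be averaged over batches of comparisons; plain floats are used here only to keep the arithmetic visible.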

Best use cases

  • Foundational reading on RLHF and modern instruction-tuning
  • Citation in any work on alignment, preference modelling, or reward modelling
  • Understanding why ChatGPT 'feels different' from raw GPT-3

Not ideal for

  • Direct production use — long superseded by ChatGPT and successors
  • Self-hosted / open-weights workflows (closed API only)

Ouyang, L. · Wu, J. · Jiang, X. · Almeida, D. · Wainwright, C. · et al.