LANGUAGE MODEL Jan 2022 OpenAI

InstructGPT

Training LMs to Follow Instructions with Human Feedback

The training recipe — supervised fine-tuning followed by reinforcement learning from human feedback — that turned GPT-3 from a raw text completer into a model that actually follows instructions. Human raters preferred the 1.3B InstructGPT model over the 175B raw GPT-3 model, suggesting that alignment can matter more than scale for user-facing tasks.

Context

Up to 2,048 tokens

Why it matters

Instruction-tuned models built on the InstructGPT recipe, not raw GPT-3-style base models, are what most people interact with when they use a "language model" today. The recipe in this paper is why the chat box on chat.openai.com gives helpful answers instead of behaving like a chaotic autocomplete.

Core Capabilities

Long Documents

Not a fit for this model: the 2,048-token context is too short for entire codebases, books, or multi-doc RAG.

Research

Foundational paper or scientific contribution.

Context Window

2k tokens (short prompt)

Scale for reference: 4k chat · 32k long docs · 128k books · 400k multi-doc · 1M codebase · 10M

Availability

API: Available
Product / App: Not available
Open Source: Not released
Enterprise: Contact sales

Pricing Model

Pay per token

Input and output billed separately.
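
As a rough illustration of separate input and output billing, the arithmetic can be sketched as below. The function name and the per-1k-token prices are hypothetical placeholders, not actual rates for this model.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_1k: float, output_price_per_1k: float) -> float:
    """Cost of one request when prompt and completion tokens are priced separately.

    Prices are hypothetical and expressed per 1,000 tokens.
    """
    input_cost = input_tokens / 1000 * input_price_per_1k
    output_cost = output_tokens / 1000 * output_price_per_1k
    return input_cost + output_cost


# Hypothetical example: 1,000 prompt tokens and 500 completion tokens.
cost = request_cost(1000, 500, input_price_per_1k=0.02, output_price_per_1k=0.04)
print(f"{cost:.4f}")
```

Because the two rates can differ, a long completion on a short prompt may cost more than the reverse, even when the total token count is the same.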

Capability / Performance

Where this model sits relative to the middle 60% of models in the tree. All scores are 0–10 (higher is better).

Context / memory (context window size, log-scaled)

Lower 20%: 6.0
Upper 80%: 9.0
This model: 0.0

Lower 20% is the 20th percentile: 20% of models score below it. Upper 80% is the 80th percentile: only 20% of models score above it. This model marks where the current model lands. Percentile boundaries are computed across every model in the tree that reports the underlying benchmark for each capability.
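
One way the percentile boundaries described above could be computed is sketched here. The score list is made up for illustration, and the choice of the standard library's inclusive quantile method is an assumption; the page does not say how its boundaries are actually derived.

```python
from statistics import quantiles


def percentile_band(scores):
    """Return the (20th, 80th) percentile boundaries of a list of scores."""
    # n=5 yields cut points at the 20th, 40th, 60th and 80th percentiles.
    cuts = quantiles(scores, n=5, method="inclusive")
    return cuts[0], cuts[-1]


def position(score, band):
    """Describe where a score falls relative to the middle 60% band."""
    lower, upper = band
    if score < lower:
        return "below the middle 60%"
    if score > upper:
        return "above the middle 60%"
    return "within the middle 60%"


# Hypothetical scores for six models on a 0-10 scale.
band = percentile_band([0.0, 2.0, 4.0, 6.0, 8.0, 10.0])
print(band, position(0.0, band))
```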

What it feels like

Introduced RLHF (reinforcement learning from human feedback) at scale — recipe behind ChatGPT, Claude, Gemini
1.3B InstructGPT preferred over 175B GPT-3 on user prompts — alignment beat scale
Three-step training: SFT on demos, reward model from comparisons, PPO against reward — still the standard template
Direct precursor of ChatGPT (Nov 2022): same recipe with more conversational tuning
TruthfulQA improvement of +10 to +25 points over base GPT-3 — measurable hallucination reduction
Among the most cited alignment papers of the GPT era; followed by Constitutional AI and DPO
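
The second and third training steps listed above can be made concrete with a small sketch. It shows a pairwise comparison loss for the reward model and a reward with a KL-style penalty of the kind used during PPO. The function names, the scalar inputs, and the `beta` value are illustrative assumptions, not the paper's code; a real pipeline would compute these over batches of model outputs.

```python
import math


def pairwise_reward_loss(r_preferred: float, r_rejected: float) -> float:
    """Comparison loss: -log(sigmoid(r_preferred - r_rejected)).

    The loss is small when the reward model scores the human-preferred
    response higher than the rejected one.
    """
    diff = r_preferred - r_rejected
    # Numerically stable softplus(-diff), which equals -log(sigmoid(diff)).
    if diff > 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def kl_penalized_reward(reward: float, logp_policy: float, logp_sft: float,
                        beta: float = 0.02) -> float:
    """Reward-model score minus a penalty for drifting from the SFT model.

    beta is an illustrative coefficient; logp_* are log-probabilities of
    the same response under the tuned policy and the SFT model.
    """
    return reward - beta * (logp_policy - logp_sft)


# Ranking the preferred response higher yields a lower loss.
print(pairwise_reward_loss(2.0, 0.0) < pairwise_reward_loss(0.0, 2.0))
```

The penalty term discourages the policy from over-optimizing the reward model by moving far from the supervised starting point.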

Reviews: OpenAI, "Aligning language models to follow instructions" · InstructGPT paper (arXiv)

Best use cases

Foundational reading on RLHF and modern instruction-tuning
Citation in any work on alignment, preference modelling, or reward modelling
Understanding why ChatGPT 'feels different' from raw GPT-3

Tools to try

ChatGPT · Codex CLI · Cursor · GitHub Copilot · Continue.dev

Not ideal for

Direct production use — long superseded by ChatGPT and successors
Self-hosted / open-weights workflows (closed API only)

Ouyang, L. · Wu, J. · Jiang, X. · Almeida, D. · Wainwright, C. · et al.