LANGUAGE MODEL Google/DeepMind

Chinchilla

Compute-Optimal LLM Training

A DeepMind study (Hoffmann et al., 2022) showing that, contrary to two years of received wisdom, most existing large language models were under-trained. Chinchilla, a 70B-parameter model trained on roughly 4× more data, outperformed the 280B-parameter Gopher at the same training compute budget. The field had been making models too large relative to their data budget.
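The "same compute" claim can be sanity-checked with a few lines of arithmetic. The sketch below uses the common approximation C ≈ 6·N·D (training FLOPs ≈ 6 × parameters × training tokens), which is a rule of thumb and not an exact accounting. The token counts (about 1.4T for the 70B model, about 300B for the 280B model) are taken from the paper, not from this card, and the function and variable names are illustrative only.

```python
# Rough training-compute comparison using the approximation C ~= 6 * N * D.
# N = parameter count, D = training tokens.
# Assumption: token counts are the figures reported in the paper (not this card).

def train_flops(params: float, tokens: float) -> float:
    """Approximate training FLOPs: about 6 FLOPs per parameter per token."""
    return 6.0 * params * tokens

chinchilla = {"params": 70e9, "tokens": 1.4e12}   # 70B params, ~1.4T tokens
gopher = {"params": 280e9, "tokens": 300e9}       # 280B params, ~300B tokens

c_chinchilla = train_flops(**chinchilla)
c_gopher = train_flops(**gopher)

print(f"70B model:  {c_chinchilla:.2e} FLOPs")
print(f"280B model: {c_gopher:.2e} FLOPs")
print(f"Compute ratio (70B / 280B): {c_chinchilla / c_gopher:.2f}")

# Tokens seen per parameter: the quantity the study argues was too low.
tpp_chinchilla = chinchilla["tokens"] / chinchilla["params"]
tpp_gopher = gopher["tokens"] / gopher["params"]
print(f"Tokens per parameter: {tpp_chinchilla:.0f} vs {tpp_gopher:.1f}")
```

Under this approximation both runs land in the same range (roughly 5 to 6 × 10^23 FLOPs), while the smaller model sees about 20 tokens per parameter against about 1 for the larger one. That gap in data per parameter, at near-equal compute, is what the study identifies as under-training.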

Context
2K
Up to 2,048 tokens

Why it matters

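If the model on your phone is smaller and yet just as capable as GPT-3 was five years ago, you are benefiting from Chinchilla. Efficient small models such as Phi, Mistral 7B, and Gemma build on its core finding: for a fixed compute budget, a smaller model trained on more data beats a larger model trained on less.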

Core Capabilities

Research
Foundational paper or scientific contribution.
Long Documents
Handles entire codebases, books, and multi-doc RAG.

Context Window

Scale markers: 2k (short prompt) · 4k (chat) · 32k (long docs) · 128k (books) · 400k (multi-doc) · 1M (codebase) · 10M
This model: 2k tokens

Availability

API
Not available
Product / App
Not available
Open Source
Not released
Enterprise

Pricing Model

Research artifact
Not commercially released.

Capability / Performance

Where this model sits relative to the middle 60% of models in the tree. All scores are 0–10 (higher is better).

Context / memory (context window size, log-scaled)
Lower 20%: 6.0
Upper 80%: 9.0
This model: 0.0

Lower 20% is the 20th percentile: 20% of models score below this. Upper 80% is the 80th percentile: only 20% of models score above this. This model marks where the current model lands. Percentile boundaries are computed across every model in the tree that reports the underlying benchmark for each capability.
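The percentile band described above can be sketched in a few lines. This is a minimal illustration, not the method this page actually uses: the linear-interpolation percentile, the `band` helper, and the sample scores are all hypothetical.

```python
# Sketch of a 20th/80th percentile band over capability scores (0-10 scale).
# Assumption: linear interpolation between ranks; the page's real method is unknown.

def percentile(values, p):
    """Return the p-th percentile (p in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    rank = (p / 100) * (len(ordered) - 1)
    low = int(rank)
    high = min(low + 1, len(ordered) - 1)
    frac = rank - low
    return ordered[low] + (ordered[high] - ordered[low]) * frac

def band(scores, model_score):
    """Return (lower, upper, position) for one model against the middle 60%."""
    lower = percentile(scores, 20)
    upper = percentile(scores, 80)
    if model_score < lower:
        position = "below band"
    elif model_score > upper:
        position = "above band"
    else:
        position = "within band"
    return lower, upper, position

# Hypothetical scores for models that report this capability.
sample_scores = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
lower, upper, position = band(sample_scores, model_score=0.0)
print(lower, upper, position)
```

A model scoring 0.0 against a lower boundary of 6.0, as reported here, falls below the band in the same way.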

What it feels like

  • Language model from DeepMind — see the linked sources below for benchmark and review coverage

Best use cases

  • General-purpose tasks within DeepMind's deployment footprint
  • See the model spec and sources block for benchmarked use cases

Not ideal for

  • Tasks far outside the modalities listed in this model's spec
  • Workflows where a more recent successor in the same family scores higher

Hoffmann, J. · Borgeaud, S. · Mensch, A. · et al.