LANGUAGE MODEL · Jan 2025 · DeepSeek · Last updated: Apr 29, 2026

DeepSeek R1

Open-Weight Reasoning Model

DeepSeek, a Chinese AI lab, released an open-weight reasoning model that approached OpenAI's o1 in quality, reportedly using a fraction of the training compute of frontier US labs. The weights were released free under a permissive license. Within a week of release, Nvidia's market cap fell by roughly $600B as investors questioned whether closed-model moats and US capex assumptions still held.


Intelligence

Below avg

Cost

Moderate

$1.68 in / $4.70 out

Context

128K

Up to 128,000 tokens
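
To make the listed prices concrete, here is a minimal sketch of the per-request arithmetic. It assumes the two figures above are US dollars per million input and output tokens, and that reasoning-trace tokens are billed as output; the function name and the example token counts are hypothetical.

```python
# Listed prices from the card above, assumed to be USD per million tokens.
INPUT_PRICE_PER_M = 1.68
OUTPUT_PRICE_PER_M = 4.70


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the dollar cost of a single request."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical request: 10,000 prompt tokens and 2,000 output tokens
# (assumption: the reasoning trace counts toward output tokens).
print(f"${request_cost(10_000, 2_000):.4f}")
```

Because the output price is almost three times the input price, long reasoning traces are likely to dominate the bill for short prompts.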

How are Intelligence, Speed & Cost bucketed?

Intelligence and Speed buckets are percentile ranks on Artificial Analysis. Cost buckets are fixed dollar thresholds keyed off output-token price ($/M out).

Intelligence

Top 1%: ≤ 1%
Top 5%: ≤ 5%
Top 10%: ≤ 10%
Good: ≤ 25%
Medium: ≤ 50%
Below avg: > 50%

Speed

Top 1%: ≥ 345 tok/s
Top 5%: ≥ 237 tok/s
Top 10%: ≥ 196 tok/s
Good: ≥ 146 tok/s
Medium: ≥ 90 tok/s
Slow: < 90 tok/s

Cost

Free: open weights · self-host
Low: < $1 / M out
Moderate: $1–5 / M out
High: ≥ $5 / M out
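
The bucketing rules above can be sketched as three small lookup functions. The thresholds are copied from the lists in this section; the function names, the `self_hosted_only` flag, and the convention that a percentile rank of 1.0 means "top 1%" are assumptions made for illustration.

```python
def intelligence_bucket(percentile_rank: float) -> str:
    """Map a percentile rank (1.0 = top 1%) to an Intelligence bucket."""
    for limit, label in [(1, "Top 1%"), (5, "Top 5%"), (10, "Top 10%"),
                         (25, "Good"), (50, "Medium")]:
        if percentile_rank <= limit:
            return label
    return "Below avg"


def speed_bucket(tokens_per_second: float) -> str:
    """Map output speed in tok/s to a Speed bucket."""
    for floor, label in [(345, "Top 1%"), (237, "Top 5%"), (196, "Top 10%"),
                         (146, "Good"), (90, "Medium")]:
        if tokens_per_second >= floor:
            return label
    return "Slow"


def cost_bucket(output_price_per_m: float, self_hosted_only: bool = False) -> str:
    """Map output-token price ($/M out) to a Cost bucket."""
    if self_hosted_only:  # hypothetical flag for open weights with no paid API
        return "Free"
    if output_price_per_m < 1:
        return "Low"
    if output_price_per_m < 5:
        return "Moderate"
    return "High"


print(cost_bucket(4.70))  # the $4.70 / M out price listed on this card
```

Under these thresholds the card's $4.70 output price lands in the Moderate bucket, which matches the Cost label shown above.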


Why it matters

R1 is the model that publicly broke the "scale is the moat" investment thesis. Whether or not the technical claims hold up under longer scrutiny, the perception shift looks permanent: every frontier US lab now has to justify its training spend against open competitors with much lower reported costs.


Core Capabilities

Long Documents

Handles entire codebases, books, and multi-doc RAG.

Generative

Produces images, video, audio, or other media.

Agent Workflows

Built for tool use and autonomous tasks.

Context Window

128k tokens

≈ 98 pages

Scale: 4k (chat) · 32k (long docs) · 128k (this model) · 400k (multi-doc) · 1M (codebase) · 10M
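
As a rough illustration of what a 128k-token window means in practice, the sketch below checks whether a piece of text is likely to fit. The 4-characters-per-token ratio is an assumption (a common rule of thumb for English, not a property of this model's tokenizer), and the output reservation is a hypothetical default.

```python
CONTEXT_WINDOW_TOKENS = 128_000  # context size stated on this card
CHARS_PER_TOKEN = 4              # assumption: rough English average, tokenizers vary


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def fits_in_context(text: str, reserved_for_output: int = 8_000) -> bool:
    """Check whether the text plus a reserved output budget fits in the window."""
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW_TOKENS


document = "word " * 50_000  # hypothetical 250,000-character document
print(estimate_tokens(document), fits_in_context(document))
```

For a real deployment, counting tokens with the model's own tokenizer would be more reliable than a character heuristic.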

Availability

API

Not available

Product / App

Not available

Open Source

Released

Enterprise

—

Pricing Model

Free / self-host

Open weights — pay only for compute.

Self-host

Capability / Performance

Where this model sits relative to the middle 60% of models in the tree. All scores are 0–10 (higher is better).

Capability (benchmark): Lower 20% / Upper 80% / This model

Reasoning (AA Intelligence Index, scaled to 10): 1.7 / 5.6 / 2.7
Coding (SciCode, scaled to 10): 1.8 / 4.3 / 4.0
Agentic tasks (Terminal-Bench Hard, scaled to 10): 0.2 / 3.6 / 1.6
Context / memory (context window size, log-scaled): 6.0 / 9.0 / 6.0
Cost efficiency (input price in $/M tokens, cheaper scores higher): 6.2 / 10.0 / 6.7

Lower 20%: the 20th percentile; 20% of models score below this.
Upper 80%: the 80th percentile; only 20% of models score above this.
This model: where the current model lands.

Percentile boundaries are computed across every model in the tree that reports the underlying benchmark for each capability.
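
One way to read the scores in this section is to express each one as a position inside the 20th-to-80th percentile band. The sketch below does that with the numbers reported here; the `band_position` helper is hypothetical and is not how the page itself renders the chart.

```python
def band_position(score: float, lower: float, upper: float) -> float:
    """Position of a score within the band: 0.0 = 20th percentile, 1.0 = 80th."""
    return (score - lower) / (upper - lower)


# (lower 20%, upper 80%, this model), copied from the scores in this section.
scores = {
    "Reasoning": (1.7, 5.6, 2.7),
    "Coding": (1.8, 4.3, 4.0),
    "Agentic tasks": (0.2, 3.6, 1.6),
    "Context / memory": (6.0, 9.0, 6.0),
    "Cost efficiency": (6.2, 10.0, 6.7),
}

for capability, (lower, upper, model) in scores.items():
    print(f"{capability}: {band_position(model, lower, upper):.2f}")
```

Read this way, coding sits near the top of the band while context and cost efficiency sit near the bottom.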

What it feels like

First open-weights reasoning model that genuinely competes with o1 — broke the closed-source moat
~90% on advanced math benchmarks vs ~83% for GPT-4o; the chain-of-thought is fully visible
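Reported training cost of ~$5.5M on 2,048 H800s; that figure comes from the final training run of DeepSeek-V3, the base model R1 builds on, and excludes earlier research and R1's own reinforcement-learning stage, but it still suggests $100M training runs may not be required to approach the frontier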
Excellent on math/logic/coding; weaker on broad creative writing and multi-turn personality
Open weights mean you can self-host, fine-tune, distil — no API rate limits
January 2025 release sparked the 'DeepSeek moment' that hit Nvidia stock and reset cost expectations
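
Because the chain-of-thought is visible, applications usually want to separate it from the final answer. The sketch below assumes the raw completion wraps its reasoning in `<think>…</think>` tags, which is how self-hosted R1 output is commonly seen; treat the tag format and the `split_reasoning` helper as assumptions and check them against your serving stack.

```python
import re


def split_reasoning(raw: str) -> tuple[str, str]:
    """Split a raw completion into (reasoning trace, final answer).

    Assumes the reasoning is wrapped in <think>...</think>; if no such
    block is found, the whole text is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", raw, flags=re.DOTALL)
    if match is None:
        return "", raw.strip()
    reasoning = match.group(1).strip()
    answer = raw[match.end():].strip()
    return reasoning, answer


# Hypothetical completion text, for illustration only.
raw_output = "<think>2 + 2 is 4.</think>\nThe answer is 4."
reasoning, answer = split_reasoning(raw_output)
print(answer)
```

Keeping the trace separate makes it possible to log it for debugging while showing users only the answer.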

Reviews: MIT Technology Review, "DeepSeek's top AI reasoning model despite sanctions" · BentoML, "Complete guide to DeepSeek models V3, R1, V4" · DataCamp, "DeepSeek vs ChatGPT comparison"

Best use cases

Math proofs, logic puzzles, and step-by-step derivations where explicit reasoning helps
Coding and engineering tasks that benefit from chain-of-thought
On-prem / air-gapped deployments where API models can't go
Cost-sensitive bulk inference (roughly 10x cheaper than ChatGPT-class models when served through third-party providers)
Distillation into smaller open models for production

Tools to try

DeepSeek Chat · Cursor · Hugging Face · Ollama

Not ideal for

Casual chat, tone, or creative writing — ChatGPT/Claude feel more polished
Multimodal tasks (image / vision) — text-only model
Latency-sensitive UX — reasoning trace adds significant time
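
The latency point can be made concrete with simple arithmetic: the reasoning trace has to be generated before the answer, so it adds its token count divided by the decoding speed. The token counts and the 60 tok/s speed below are hypothetical, and the sketch ignores time to first token and network overhead.

```python
def response_seconds(reasoning_tokens: int, answer_tokens: int,
                     tokens_per_second: float) -> float:
    """Estimate generation time when reasoning tokens precede the answer."""
    return (reasoning_tokens + answer_tokens) / tokens_per_second


# Hypothetical workload: 3,000 reasoning tokens, 300 answer tokens, 60 tok/s.
with_trace = response_seconds(3_000, 300, 60)
without_trace = response_seconds(0, 300, 60)
print(f"{with_trace:.1f}s with reasoning vs {without_trace:.1f}s without")
```

Under these assumed numbers the trace accounts for most of the wait, which is why interactive products may prefer a non-reasoning model for quick replies.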