LANGUAGE MODEL Alibaba Last updated:

Qwen 3

Toggleable Thinking, 119 Languages 可切换思考模式,119 语种

Alibaba's open-weight Qwen 3 family covers everything from a tiny 0.6B model to a 235B mixture-of-experts. Every size has a switch that turns "thinking" mode on or off — same weights, two behaviors. Speaks 119 languages and was the first big open release to match DeepSeek-R1 on reasoning benchmarks. 阿里开放权重的 Qwen 3 家族,规格从 0.6B 一路开到 235B 的 MoE 全覆盖。每个规格都内置"思考模式"开关——同一份 权重,两套行为。支持 119 种语言,也是第一个在推理基准 上匹敌 DeepSeek-R1 的大型开源版本。

Intelligence
Medium
Speed
Slow
86 tok/s output
Cost
Low
$0.08 in / $0.29 out
Context
1M
Up to 1,000,000 tokens
How are Intelligence, Speed & Cost bucketed?
Intelligence and Speed buckets are percentile ranks on Artificial Analysis. Cost buckets are fixed dollar thresholds keyed off output-token price ($/M out).
Intelligence
  • Top 1%≤ 1%
  • Top 5%≤ 5%
  • Top 10%≤ 10%
  • Good≤ 25%
  • Medium≤ 50%
  • Below avg> 50%
Speed
  • Top 1%≥ 345 tok/s
  • Top 5%≥ 237 tok/s
  • Top 10%≥ 196 tok/s
  • Good≥ 146 tok/s
  • Medium≥ 90 tok/s
  • Slow< 90 tok/s
Cost
  • Freeopen weights · self-host
  • Low< $1 / M out
  • Moderate$1–5 / M out
  • High≥ $5 / M out

Why it matters

Demonstrated that the open-weight Chinese ecosystem could match Western closed reasoning models within months of o3 and R1, on permissive licenses, across the entire size spectrum from edge (0.6B) to frontier (235B MoE).

证明了开源中文生态在 o3 和 R1 发布几个月内就能跟上 西方闭源推理模型——还顶着宽松许可,从 0.6B 边缘端到 235B MoE 前沿端全尺寸段都不缺席。

Core Capabilities

Long Documents
Handles entire codebases, books, and multi-doc RAG.
Multimodal
Combines text, vision, and audio in one model.
Generative
Produces images, video, audio, or other media.
Agent Workflows
Built for tool use and autonomous tasks.

Context Window

1M tokens
≈ entire codebase
4k Chat 聊天
32k Long docs 长文档
128k Books 整本书
400k Multi-doc 多文档
1M This model 本模型
10M

Availability

API
Available
Product / App
Available
Open Source
Released
Enterprise
Contact sales

Pricing Model

Free / self-host
Open weights — pay only for compute.
Self-host

Capability / Performance

Where this model sits relative to the middle 60% of models in the tree. All scores are 0–10 (higher is better).

Lower 20% Upper 80% This model
Reasoning
AA Intelligence Index · scaled to 10
1.7
5.6
2.9
Coding
SciCode · scaled to 10
1.8
4.3
3.1
Agentic tasks
Terminal-Bench Hard · scaled to 10
0.2
3.6
0.8
Context / memory
Context window size · log-scaled
6.0
9.0
9.0
Cost efficiency
Input price ($/M tokens) · cheaper scores higher
6.2
10.0
10.0
Lower 20% 20th percentile — 20% of models score below this This model Where the current model lands Upper 80% 80th percentile — only 20% of models score above this Percentile boundaries are computed across every model in the tree that reports the underlying benchmark for each capability.

What it feels like

  • Best open-source reasoning model at its release — 235B-A22B (Thinking) beats DeepSeek-R1 on 17/23 benchmarks
  • Toggle-able thinking mode: same weights serve both reasoning and fast-chat modes
  • Strong 119-language coverage; the most genuinely multilingual frontier-tier model
  • Coder variant reaches 77.2% on SWE-bench Verified — competitive with Claude 4.5 Opus's 80.9%
  • GPQA Diamond 87.8% and AIME26 94.1% — frontier reasoning at open-weights pricing
  • Apache-2.0 license + 1M-context coder variant make it production-ready, not a research toy

Best use cases

  • Multilingual production deployments (119 languages) where most models stay English-centric
  • Self-hosted reasoning workflows that need both 'fast mode' and 'thinking mode' from one weight set
  • Open-weights agentic coding (Qwen3-Coder) with very large context windows
  • Cost-sensitive bulk reasoning that would be prohibitive on closed APIs

Tools to try

Not ideal for

  • Multimodal tasks (Qwen3 base is text — vision lives in Qwen3-VL, audio in Qwen3-Audio)
  • Edge / single-consumer-GPU deployments at the 235B scale
  • Workflows where Western-platform compatibility is a contractual requirement

Model Evolution

qwen is Alibaba's language model family.

View full evolution tree →