Kimi Chat
Moonshot AI is a Beijing startup founded in March 2023 by ex-Tsinghua and ex-Google researchers. Its consumer chat product, Kimi, launched in October 2023 with a 200k-token context window, the longest in the Chinese market at the time, and expanded to 2M tokens in March 2024. The product's "upload an entire book / annual report and ask anything" framing made it the consumer breakout among Chinese AI assistants in 2024.
Why it matters
Kimi made "long context" a marketing dimension that Chinese consumers shopped on, much as "fast charging" became a smartphone selling point in China before it did elsewhere. Treating long context as a productization choice, not just a benchmark number, became the operating playbook.
Capability / Performance
Where this model sits relative to the middle 60% of models in the tree. All scores are 0–10 (higher is better).
What it feels like
- Kimi K2: 1T-parameter MoE with 32B active per token; Moonshot's open-weights debut, released under a modified MIT license
- 65.8% on SWE-bench Verified (single attempt); at release, it trailed only Claude Sonnet 4 among the models tested
- 53.7% on LiveCodeBench v6, which places it in the top tier of open-weights coding models
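The parameter counts in the first bullet imply that only a small share of the model runs for any given token. A minimal sketch of that arithmetic, using only the 1T total and 32B active figures quoted above; the function and constant names are illustrative, not part of any Moonshot tooling:

```python
def active_fraction(total_params: float, active_params: float) -> float:
    """Share of parameters that participate in one token's forward pass."""
    if total_params <= 0 or not 0 <= active_params <= total_params:
        raise ValueError("need total_params > 0 and 0 <= active_params <= total_params")
    return active_params / total_params


TOTAL_PARAMS = 1e12   # 1T total parameters (figure from the list above)
ACTIVE_PARAMS = 32e9  # 32B active per token (figure from the list above)

fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"{fraction:.1%} of the weights are active per token")  # 3.2%
```

Per-token compute scales roughly with the active count, but every expert still has to be held in memory, so the sparse design lowers compute cost without shrinking the serving footprint.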
Best use cases
- Self-hosted agent platforms where API models can't go (regulated, private cloud)
- Cost-sensitive frontier-tier inference via budget providers
- Coding agents that need long context + open weights for fine-tuning
Not ideal for
- Edge / single-GPU deployments — 1T MoE still demands multi-node serving
- Multimodal tasks (text-only at this generation; vision lives in Kimi-VL)
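The multi-node point in the first bullet follows from weight memory alone. The sketch below is a rough estimate under stated assumptions: 80 GB of memory per GPU and 8 GPUs per node are illustrative values, not figures from this page, and the calculation ignores KV cache, activations, and framework overhead, so real deployments need more headroom.

```python
import math


def weight_memory_gb(total_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in decimal gigabytes."""
    return total_params * bytes_per_param / 1e9


def min_gpus_for_weights(total_params: float, bytes_per_param: float,
                         gpu_memory_gb: float) -> int:
    """Lower bound on GPU count: weights only, no cache or activations."""
    return math.ceil(weight_memory_gb(total_params, bytes_per_param) / gpu_memory_gb)


TOTAL_PARAMS = 1e12   # 1T parameters (figure from the text)
GPU_MEMORY_GB = 80    # assumption: memory per accelerator
GPUS_PER_NODE = 8     # assumption: accelerators per server

for label, bytes_per_param in [("8-bit", 1), ("16-bit", 2)]:
    gpus = min_gpus_for_weights(TOTAL_PARAMS, bytes_per_param, GPU_MEMORY_GB)
    nodes = math.ceil(gpus / GPUS_PER_NODE)
    size_gb = weight_memory_gb(TOTAL_PARAMS, bytes_per_param)
    print(f"{label}: {size_gb:.0f} GB of weights, >= {gpus} GPUs, >= {nodes} nodes")
```

Under these assumptions, even 8-bit weights come to about 1,000 GB, which exceeds the 640 GB available on one 8-GPU node. An MoE model activates only a subset of experts per token, but all experts must stay resident, so sparsity does not reduce this footprint.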