EMBEDDING MODEL · Apr 2025 · Cohere · Last updated: Apr 29, 2026

Cohere Embed v4

128K Context Multimodal Embeddings

Cohere's April 2025 embedding model: a 128K-token context window (listed here as the longest available), multilingual coverage across 100+ languages, and native support for binary quantization, which shrinks vector storage by 10× or more (one bit per dimension instead of a 32-bit float) without significant retrieval quality loss. Listed at the top of MTEB with a 66.3 average score.
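As a rough illustration of what binary quantization does to a vector, the sketch below keeps only the sign of each dimension and compares vectors by Hamming distance. It is a generic, standard-library version of the technique, not Cohere's implementation; the function names and the tiny 8-dimensional vectors are made up for the example.

```python
from typing import List


def binarize(vec: List[float]) -> bytes:
    """Pack the sign of each dimension into one bit (1 if the value is > 0)."""
    out = bytearray((len(vec) + 7) // 8)
    for i, x in enumerate(vec):
        if x > 0:
            out[i // 8] |= 1 << (7 - i % 8)
    return bytes(out)


def hamming(a: bytes, b: bytes) -> int:
    """Count differing bits; a smaller distance means more similar vectors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def float32_bytes(dim: int) -> int:
    """Storage for a float32 vector: 4 bytes per dimension."""
    return dim * 4


def binary_bytes(dim: int) -> int:
    """Storage for a binarized vector: 1 bit per dimension, rounded up to bytes."""
    return (dim + 7) // 8


# Illustrative vectors only; real embeddings have hundreds or thousands of dimensions.
doc = [0.12, -0.40, 0.05, 0.33, -0.08, 0.27, -0.91, 0.14]
query = [0.10, -0.35, -0.02, 0.30, -0.11, 0.25, -0.80, 0.09]

doc_bits = binarize(doc)
query_bits = binarize(query)
distance = hamming(doc_bits, query_bits)  # the two vectors differ in one sign

# For a 1024-dimensional vector: 4096 bytes as float32 versus 128 bytes as bits.
ratio = float32_bytes(1024) // binary_bytes(1024)
```

At one bit per dimension against 32-bit floats, the storage ratio for the same dimension count works out to 32×. Retrieval pipelines often rescore the top binary matches with higher-precision vectors, which is one common way to limit quality loss.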


Context

128K

Up to 128,000 tokens


Why it matters

Extended embedding context windows from roughly 8K tokens (OpenAI text-embedding-3) to 128K, so entire chapters, contracts, or case files can be embedded as single vectors instead of being split into chunks and aggregated. This changes RAG architecture for long-document use cases.
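To make the architectural difference concrete, here is a minimal sketch of the split-and-aggregate pattern that a short window forces, next to the single-pass case. The 8,192-token window, the 60,000-token document, and the mean-pooling aggregation are assumptions chosen for illustration; no embedding API is called.

```python
import math
from typing import List


def chunks_needed(n_tokens: int, window: int) -> int:
    """How many embedding calls a document needs for a given context window."""
    return max(1, math.ceil(n_tokens / window))


def split_tokens(tokens: List[str], window: int) -> List[List[str]]:
    """Split a token list into consecutive chunks of at most `window` tokens."""
    return [tokens[i:i + window] for i in range(0, len(tokens), window)]


def mean_pool(vectors: List[List[float]]) -> List[float]:
    """Aggregate per-chunk vectors into one vector by averaging each dimension."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


contract_tokens = 60_000  # hypothetical long contract

short_window_calls = chunks_needed(contract_tokens, 8_192)   # 8 chunks to embed and aggregate
long_window_calls = chunks_needed(contract_tokens, 128_000)  # 1 call, no aggregation step

# With a short window, the per-chunk vectors still have to be combined somehow.
pooled = mean_pool([[1.0, 0.0], [0.0, 1.0]])  # [0.5, 0.5]
```

Mean pooling is only one aggregation choice; many RAG systems instead index every chunk separately. Either way, a window that fits the whole document removes the chunking step and the decisions that come with it.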

Core Capabilities

Long Documents

Handles long inputs such as book chapters, contracts, and large source files within the 128K-token window.

Multimodal

Embeds text and images in one model.

Research

Foundational paper or scientific contribution.

Vision

Understands images, scenes, and visual context.

Context Window

128k tokens

≈ 98 pages

4k Chat

32k Long docs

128k This model

400k Multi-doc

1M Codebase

10M
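The scale above can be read as a simple lookup. The sketch below converts a token count to pages using the card's own ratio (128,000 tokens shown as roughly 98 pages) and finds the smallest labelled tier that fits. Treating "4k" as 4,000 and so on is an assumption, as are the function names.

```python
from typing import Optional

# Ratio taken from the card: 128,000 tokens is shown as roughly 98 pages.
CARD_TOKENS = 128_000
CARD_PAGES = 98

# Labelled ticks from the scale; the 10M tick carries no label in the source,
# so it is left out here.
TIERS = [
    (4_000, "Chat"),
    (32_000, "Long docs"),
    (128_000, "This model"),
    (400_000, "Multi-doc"),
    (1_000_000, "Codebase"),
]


def estimate_pages(n_tokens: int) -> float:
    """Scale a token count to pages using the card's tokens-to-pages ratio."""
    return n_tokens * CARD_PAGES / CARD_TOKENS


def smallest_tier(n_tokens: int) -> Optional[str]:
    """Return the first labelled tier whose limit covers the input, if any."""
    for limit, label in TIERS:
        if n_tokens <= limit:
            return label
    return None


def fits_this_model(n_tokens: int) -> bool:
    """True when the input fits in the 128K-token window."""
    return n_tokens <= CARD_TOKENS


pages = estimate_pages(64_000)   # 49.0 pages at the card's ratio
tier = smallest_tier(50_000)     # "This model"
```

Page estimates depend heavily on tokenizer, language, and layout, so the ratio is best treated as a rule of thumb.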

Availability

API

Available

Product / App

Not available

Open Source

Not released

Enterprise

Contact sales

Pricing Model

Pay per token

Billed on input tokens; an embedding model returns vectors rather than generated text, so there are no output tokens to bill.
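Since an embedding request returns vectors rather than generated text, a rough budget only needs the input token count. The sketch below shows the arithmetic; the rate is a placeholder and not Cohere's published price, and the corpus size is hypothetical.

```python
def estimate_embedding_cost(n_tokens: int, usd_per_million_tokens: float) -> float:
    """Estimate the cost of embedding `n_tokens` input tokens at a per-million rate."""
    if n_tokens < 0:
        raise ValueError("n_tokens must be non-negative")
    return n_tokens / 1_000_000 * usd_per_million_tokens


# Placeholder rate for illustration only; check the provider's pricing page.
PLACEHOLDER_USD_PER_MILLION = 0.10

# Hypothetical corpus: 5,000 documents of about 12,000 tokens each.
corpus_tokens = 5_000 * 12_000
cost = estimate_embedding_cost(corpus_tokens, PLACEHOLDER_USD_PER_MILLION)
```

With these made-up numbers the corpus comes to 60 million tokens, or about 6 USD at the placeholder rate. Re-embedding after a model change costs the same again, which is worth budgeting for.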

What it feels like

Multimodal embedding model from Cohere; see the linked sources below for benchmark and review coverage
Vision and multimodal retrieval tasks are the typical fit per the published model card

Best use cases

Search and retrieval over visual content (charts, documents, images) per the model card
See the model spec and sources block for benchmarked use cases

Tools to try

Cohere Coral, Cohere Playground

Not ideal for

Tasks far outside the modalities listed in this model's spec
Workflows where a more recent successor in the same family scores higher