EMBEDDING MODEL · Jan 2024 · BAAI · Last updated: Apr 29, 2026

BGE-M3

Open Multilingual Multifunction Embeddings

BGE-M3 is the embedding model released in January 2024 by the Beijing Academy of Artificial Intelligence (BAAI). It supports 100+ languages and produces dense, sparse, and multi-vector outputs from a single forward pass. It is widely deployed and forked, and is a common default open embedding model for multilingual RAG.
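
The three output types are usually scored in different ways, and the sketch below illustrates one way to do that on hand-written toy data. It is not the model's own API: the function names, the toy vectors, and the fusion weights are all assumptions made for illustration, and a real deployment would take the vectors from the model instead.

```python
import math

def dense_score(q, d):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(q, d))
    norm = math.sqrt(sum(a * a for a in q)) * math.sqrt(sum(b * b for b in d))
    return dot / norm if norm else 0.0

def sparse_score(q_weights, d_weights):
    """Lexical match: sum of weight products over tokens present on both sides."""
    return sum(w * d_weights[t] for t, w in q_weights.items() if t in d_weights)

def multi_vector_score(q_vecs, d_vecs):
    """Late interaction: best-matching document vector per query vector, averaged."""
    best = [max(dense_score(q, d) for d in d_vecs) for q in q_vecs]
    return sum(best) / len(best)

def hybrid_score(dense, sparse, multi, weights=(0.4, 0.2, 0.4)):
    """Weighted sum of the three scores; the weights here are hypothetical."""
    w_dense, w_sparse, w_multi = weights
    return w_dense * dense + w_sparse * sparse + w_multi * multi

# Toy query and document representations (hand-written, not model output).
query = {
    "dense": [1.0, 0.0],
    "sparse": {"bge": 0.5, "m3": 0.5},
    "multi": [[1.0, 0.0], [0.0, 1.0]],
}
doc = {
    "dense": [1.0, 0.0],
    "sparse": {"bge": 0.4, "rag": 0.3},
    "multi": [[1.0, 0.0], [1.0, 0.0]],
}

dense = dense_score(query["dense"], doc["dense"])
sparse = sparse_score(query["sparse"], doc["sparse"])
multi = multi_vector_score(query["multi"], doc["multi"])
combined = hybrid_score(dense, sparse, multi)
```

Keeping the three scores separate until the final weighted sum makes it easy to tune the weights per language or corpus, or to drop one signal entirely.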

Cost

Free

Open weights — self-host

Context

Up to 8,192 tokens

Why it matters

BGE-M3 made high-quality multilingual embeddings free and self-hostable. Paired with BGE-Reranker v2-m3, it gives the BGE stack an open retrieve-then-rerank pipeline for RAG in languages other than English.
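
A retrieve-then-rerank pipeline of the kind the BGE stack is used for can be sketched as two sorting passes. This is a minimal sketch under stated assumptions: `retrieve_then_rerank`, `shared_words`, and `jaccard` are hypothetical names, and the two word-overlap scorers are toy stand-ins for an embedding model and a reranker, not calls into either.

```python
def retrieve_then_rerank(query, docs, first_stage, reranker, k=2):
    """Two-stage search: cheap scoring over all docs, costly scoring over the top k."""
    # Stage 1: score every document and keep the k best candidates.
    candidates = sorted(docs, key=lambda d: first_stage(query, d), reverse=True)[:k]
    # Stage 2: re-order only the shortlist with the more expensive scorer.
    return sorted(candidates, key=lambda d: reranker(query, d), reverse=True)

def shared_words(query, doc):
    """Toy stand-in for embedding similarity: count of words in common."""
    return len(set(query.split()) & set(doc.split()))

def jaccard(query, doc):
    """Toy stand-in for a reranker: word overlap divided by word union."""
    q, d = set(query.split()), set(doc.split())
    return len(q & d) / len(q | d)

docs = [
    "bge m3 embeds multilingual text for retrieval and many other things",
    "bge m3 embeds text",
    "cats sleep",
]
ranked = retrieve_then_rerank("bge m3 text", docs, shared_words, jaccard, k=2)
```

The first stage treats the two BGE documents as equally good, while the second stage prefers the shorter, more focused one; the unrelated document never reaches the reranker.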

Core Capabilities

Long Documents

Accepts inputs of up to 8,192 tokens per pass; books, codebases, and multi-document RAG corpora need to be split into chunks first.

Research

Accompanied by a research paper describing the model.

Context Window

8k tokens

≈ short doc

Scale: 4k (chat) · 32k (long docs) · 128k (books) · 400k (multi-doc) · 1M (codebase) · 10M
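
Inputs longer than the 8,192-token window have to be split before embedding. The following is a minimal sketch of overlapping windows over a list of token IDs; `chunk_tokens` and the 256-token overlap are assumptions for illustration, and a real tokenizer may add special tokens, so some headroom below the limit is advisable.

```python
def chunk_tokens(token_ids, max_len=8192, overlap=256):
    """Split token IDs into windows of at most max_len, sharing `overlap` tokens."""
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be non-negative and smaller than max_len")
    step = max_len - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append(token_ids[start:start + max_len])
        # Stop once this window has reached the end of the input.
        if start + max_len >= len(token_ids):
            break
    return chunks

# A 20,000-token document, represented here by dummy integer IDs.
chunks = chunk_tokens(list(range(20000)))
chunk_lengths = [len(c) for c in chunks]
```

Each chunk is then embedded separately, and the overlap reduces the chance that a passage relevant to a query is cut in half at a window boundary.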

Availability

API: Not available
Product / App: Not available
Open Source: Released
Enterprise: not listed

Pricing Model

Free / self-host

Open weights — pay only for compute.

What it feels like

Embedding model from BAAI. See the linked sources below for benchmark and review coverage.

Best use cases

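Multilingual retrieval and RAG across 100+ languages
Hybrid search that combines dense, sparse, and multi-vector scores
Self-hosted deployments that need open weights
See the model spec and sources block for benchmarked use cases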

Not ideal for

Text generation or chat; the model outputs embeddings, not text
Single inputs longer than 8,192 tokens without chunking
Workflows where a more recent successor in the same family scores higher