Perplexity Sonar
Perplexity AI's family of LLM variants tuned explicitly for "answer this question and cite your sources." Released as paid API endpoints (Sonar Small, Large, Huge) in 2024. The underlying technique, Llama 3.1 fine-tuned for citation quality and tightly integrated with Perplexity's web search backend, has become a common architecture for consumer "AI search" products.
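The "answer and cite your sources" pattern can be illustrated at the prompt level. The sketch below is not Perplexity's implementation (Sonar is fine-tuned for this behavior, and its search backend is not described here). It only assumes that search results are numbered, the model is asked to cite them by number, and the cited numbers are mapped back to URLs. The names `Source`, `build_grounded_prompt`, and `cited_urls` are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Source:
    """One search result handed to the model (hypothetical structure)."""
    title: str
    url: str
    snippet: str


def build_grounded_prompt(question: str, sources: list[Source]) -> str:
    """Number the sources and ask for an answer that cites them as [n]."""
    numbered = "\n".join(
        f"[{i}] {s.title} ({s.url}): {s.snippet}"
        for i, s in enumerate(sources, start=1)
    )
    return (
        "Answer the question using only the sources below. "
        "Cite each claim with its source number in square brackets.\n\n"
        f"Sources:\n{numbered}\n\nQuestion: {question}\nAnswer:"
    )


def cited_urls(answer: str, sources: list[Source]) -> list[str]:
    """Return URLs of sources cited in the answer, in first-cited order.

    Citation numbers that do not match a supplied source are ignored.
    """
    urls: list[str] = []
    for match in re.findall(r"\[(\d+)\]", answer):
        index = int(match)
        if 1 <= index <= len(sources):
            url = sources[index - 1].url
            if url not in urls:
                urls.append(url)
    return urls
```

Dropping out-of-range citation numbers is a design choice of this sketch: it keeps a hallucinated "[7]" from being rendered as a link when only two sources were supplied.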
How are Intelligence, Speed & Cost bucketed?
Intelligence:
- Top 1%: ≤ 1%
- Top 5%: ≤ 5%
- Top 10%: ≤ 10%
- Good: ≤ 25%
- Medium: ≤ 50%
- Below avg: > 50%
Speed:
- Top 1%: ≥ 345 tok/s
- Top 5%: ≥ 237 tok/s
- Top 10%: ≥ 196 tok/s
- Good: ≥ 146 tok/s
- Medium: ≥ 90 tok/s
- Slow: < 90 tok/s
Cost:
- Free: open weights · self-host
- Low: < $1 / M out
- Moderate: $1–5 / M out
- High: ≥ $5 / M out
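The thresholds above can be written as simple lookup functions. This is a minimal sketch, not the site's actual code: the function names are hypothetical, the intelligence figure is assumed to be a rank percentile where lower is better, and "Moderate" is assumed to cover prices from $1 up to but not including $5 per million output tokens.

```python
def intelligence_bucket(rank_pct: float) -> str:
    """Map a rank percentile (assumed: lower is better) to its label."""
    limits = [(1, "Top 1%"), (5, "Top 5%"), (10, "Top 10%"),
              (25, "Good"), (50, "Medium")]
    for limit, label in limits:
        if rank_pct <= limit:
            return label
    return "Below avg"


def speed_bucket(tok_per_s: float) -> str:
    """Map output speed in tokens per second to its label."""
    floors = [(345, "Top 1%"), (237, "Top 5%"), (196, "Top 10%"),
              (146, "Good"), (90, "Medium")]
    for floor, label in floors:
        if tok_per_s >= floor:
            return label
    return "Slow"


def cost_bucket(usd_per_m_out: float, open_weights: bool = False) -> str:
    """Map price per million output tokens to its label.

    Open-weight, self-hostable models are labeled "Free" regardless of price.
    """
    if open_weights:
        return "Free"
    if usd_per_m_out < 1:
        return "Low"
    if usd_per_m_out < 5:
        return "Moderate"
    return "High"
```

For example, a model generating 150 tok/s at $3 per million output tokens would land in "Good" for speed and "Moderate" for cost.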
Why it matters
Perplexity Sonar represents the "search-grounded LLM" as a distinct model category — a specialization that generic LLMs (GPT, Claude) approach but don't own as a primary product surface. Whether AI search remains a separate category or gets absorbed by general-purpose chat is the open question.
Capability / Performance
Where this model sits relative to the middle 60% of models in the tree. All scores are 0–10 (higher is better).
What it feels like
- Language model from Perplexity — see the linked sources below for benchmark and review coverage
- Tool-use and agent loops are the typical fit per the published model card
Best use cases
- Agent / tool-use workflows that match the model's published benchmarks
- See the model spec and sources block for benchmarked use cases
Not ideal for
- Tasks far outside the modalities listed in this model's spec
- Workflows where a more recent successor in the same family scores higher