Voxtral TTS (v26.03)
Voxtral TTS (v26.03) is an API model from Mistral AI. It is positioned for audio tasks, particularly work that benefits from iteration rather than a single one-shot answer.
Core Capabilities
Audio
Speech, music, or other audio understanding/synthesis.
Generative
Produces images, video, audio, or other media.
Context Window
Context window not disclosed.
Availability
API
Available
Product / App
Not available
Open Source
Not released
Enterprise
Contact sales
Pricing Model
Pay per token
Input and output billed separately.
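As a rough illustration of what separate input and output billing means in practice, the sketch below estimates the charge for a single request. The function name, the per-million-token rate convention, and the example rates are all hypothetical placeholders; this page does not state actual Voxtral TTS prices or how tokens are counted for audio.

```python
from decimal import Decimal


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate_per_million: Decimal,
    output_rate_per_million: Decimal,
) -> Decimal:
    """Estimate one request's charge when input and output are billed separately.

    Rates are expressed per one million tokens (an assumption for this sketch,
    not a documented Voxtral TTS convention).
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    million = Decimal(1_000_000)
    # Each side is priced with its own rate, then the two are summed.
    input_cost = Decimal(input_tokens) * input_rate_per_million / million
    output_cost = Decimal(output_tokens) * output_rate_per_million / million
    return input_cost + output_cost


# Placeholder rates for illustration only; NOT real Voxtral TTS pricing.
example_cost = estimate_cost(
    input_tokens=2_000,
    output_tokens=50_000,
    input_rate_per_million=Decimal("1.00"),
    output_rate_per_million=Decimal("4.00"),
)
print(f"Estimated cost: {example_cost}")
```

Using `Decimal` rather than floats avoids rounding drift when many small per-request charges are summed. Substitute the published rates once they are known.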
What it feels like
- Audio model from Mistral AI; see the linked sources below for benchmark and review coverage
- Audio synthesis or transcription per the published model card
Best use cases
- Audio synthesis / transcription tasks per the model card
- See the model spec and sources block for benchmarked use cases
Not ideal for
- Tasks far outside the modalities listed in this model's spec
- Workflows where a more recent successor in the same family scores higher