LTX-2
Lightricks' Open Video + Audio Model
Lightricks (the Israeli company behind Photoleap) released LTX-2 in January 2026. The model has 19B parameters in total, including a 5B audio model that generates synchronized sound alongside 4K/50fps video. Its Apache 2.0 license makes it the most permissive open video+audio stack available.
Cost
Free
Open weights — self-host
How are Intelligence, Speed & Cost bucketed?
Intelligence and Speed buckets are percentile ranks on Artificial Analysis. Cost buckets are fixed dollar thresholds keyed off output-token price ($/M out).
Intelligence
- Top 1%: ≤ 1%
- Top 5%: ≤ 5%
- Top 10%: ≤ 10%
- Good: ≤ 25%
- Medium: ≤ 50%
- Below avg: > 50%
Speed
- Top 1%: ≥ 345 tok/s
- Top 5%: ≥ 237 tok/s
- Top 10%: ≥ 196 tok/s
- Good: ≥ 146 tok/s
- Medium: ≥ 90 tok/s
- Slow: < 90 tok/s
Cost
- Free: open weights · self-host
- Low: < $1 / M out
- Moderate: $1–5 / M out
- High: ≥ $5 / M out
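The bucket thresholds listed above can be read as simple lookup rules. The following is a minimal sketch, not the site's actual implementation: the function names are hypothetical, and it assumes the Moderate cost range is half-open ($1 up to but not including $5), since the legend assigns exactly $5 to High.

```python
from typing import Optional

# Cutoffs copied from the legend; the function names are illustrative only.
SPEED_FLOORS = [(345, "Top 1%"), (237, "Top 5%"), (196, "Top 10%"),
                (146, "Good"), (90, "Medium")]
INTELLIGENCE_CEILINGS = [(1, "Top 1%"), (5, "Top 5%"), (10, "Top 10%"),
                         (25, "Good"), (50, "Medium")]


def speed_bucket(tokens_per_second: float) -> str:
    """Return the first speed bucket whose floor the value meets."""
    for floor, label in SPEED_FLOORS:
        if tokens_per_second >= floor:
            return label
    return "Slow"


def intelligence_bucket(percentile_rank: float) -> str:
    """Return the first bucket whose ceiling covers the percentile rank."""
    for ceiling, label in INTELLIGENCE_CEILINGS:
        if percentile_rank <= ceiling:
            return label
    return "Below avg"


def cost_bucket(usd_per_million_out: Optional[float],
                open_weights: bool = False) -> str:
    """Open-weight models are Free; others are bucketed by $/M output tokens."""
    if open_weights:
        return "Free"
    if usd_per_million_out is None:
        raise ValueError("a price is required for models without open weights")
    if usd_per_million_out < 1:
        return "Low"
    if usd_per_million_out < 5:  # assumption: Moderate is [$1, $5)
        return "Moderate"
    return "High"


if __name__ == "__main__":
    # LTX-2 ships open weights, so it lands in the Free bucket.
    print(cost_bucket(None, open_weights=True))
```

Under this reading, LTX-2 is classified as Free regardless of any hosted API price, because the open-weights check runs first.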
Why it matters
LTX-2 is the first open-weight stack that integrates video and audio generation. It sets the bar for an "open Sora 2 alternative": closed competitors (Sora 2, Veo 3, Kling 3) were already shipping synchronized audio, and LTX-2 made the same combination available with open weights.
Core Capabilities
Generative
Produces images, video, audio, or other media.
Multimodal
Combines text, vision, and audio in one model.
Vision
Understands images, scenes, and visual context.
Context Window
Context window not disclosed.
Availability
API
Available
Product / App
Not available
Open Source
Released
Enterprise
Contact sales
Pricing Model
Free / self-host
Open weights — pay only for compute.
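Since self-hosting means paying for compute, a rough first question is how much memory the weights alone occupy. The sketch below is back-of-the-envelope arithmetic on the parameter counts quoted above (19B total, 5B audio). The precisions shown are assumptions for illustration, not a statement of how the released checkpoints are stored, and the estimate ignores activations, text encoders, and any other runtime overhead.

```python
# Parameter counts from the description above.
TOTAL_PARAMS = 19e9
AUDIO_PARAMS = 5e9

# Bytes per parameter for a few common storage precisions (assumed, not
# taken from the LTX-2 release notes).
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp8": 1}


def weight_gb(params: float, dtype: str) -> float:
    """Approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        total = weight_gb(TOTAL_PARAMS, dtype)
        audio = weight_gb(AUDIO_PARAMS, dtype)
        print(f"{dtype}: ~{total:.0f} GB total, of which ~{audio:.0f} GB audio")
```

Actual hardware requirements depend on resolution, clip length, and the inference stack, so treat these figures as a lower bound on memory rather than a sizing guide.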
What it feels like
- Open-weight video+audio generation model from Lightricks; see the linked sources below for benchmark and review coverage
- Generating video with synchronized audio is the typical fit, per the description above
Best use cases
- Generating video with synchronized audio (up to 4K/50fps) on self-hosted infrastructure
- See the model spec and sources block for benchmarked use cases
Not ideal for
- Tasks far outside the modalities listed in this model's spec
- Workflows where a more recent successor in the same family scores higher