Sora
OpenAI's text-to-video model, first demonstrated as a research preview in February 2024. The viral demo videos (a fashion shoot in Tokyo, a snow leopard in the mountains, vintage California drone footage) convinced much of the public that AI-generated video could now be mistaken for real footage. OpenAI held the model back from public release until December 2024 (Sora 1.0) and substantially reworked it as Sora 2 in late 2025.
Why it matters
Sora marks the point at which video joined image, text, audio, and code as a domain where AI-generated content was no longer obviously distinguishable from human-produced content. Advertising, film, content moderation, and efforts to counter disinformation are still absorbing the implications.
Core Capabilities
Context Window
Context window not disclosed.
Capability / Performance
Where this model sits relative to the middle 60% of models in the tree. All scores are 0–10 (higher is better).
What it feels like
- The original Sora preview has been succeeded by Sora 2; the points below describe Sora 2
- First Sora generation with native synchronized audio: speech, sound effects, and ambient soundscape from one model
- More realistic physics: a missed basketball shot rebounds off the backboard; objects respect buoyancy and rigidity
- Olympic gymnastics, paddleboard backflips, ice-skating triple axels: motion that prior systems couldn't render
Best use cases
- Short-form social video generation (the Sora app's whole purpose)
- Storyboards / previz where physical accuracy matters more than fine-grained creative control
- Custom-character video, using the cameo feature for personalisation
Not ideal for
- Frame-perfect creative control: Sora lacks the full keyframe editing that Runway and Kling offer
- Long-form (>60s) coherent narrative: it works best on short clips