VIDEO MODEL · Feb 2024 · OpenAI · Last updated: Apr 29, 2026

Sora

OpenAI's Text-to-Video

OpenAI's text-to-video model, first demonstrated as a research preview in February 2024. The viral demo videos (a fashion shoot in Tokyo, a snow leopard in the mountains, vintage California drone footage) convinced the public that AI-generated video was approaching the point where it could be mistaken for real footage. OpenAI held the model back from public release until December 2024 (Sora 1.0) and followed it with a substantially revised Sora 2 in late 2025.


Why it matters

Sora marks the inflection point at which video joined image, text, audio, and code as a domain in which AI-generated content is no longer obviously distinguishable from human-produced content. The implications for advertising, film, content moderation, and disinformation are still being absorbed.

Core Capabilities

Generative

Produces images, video, audio, or other media.

Multimodal

Combines text, vision, and audio in one model.

Context Window

Context window not disclosed.

Availability

API

Not available

Product / App

Not available

Open Source

Not released

Enterprise

—

Pricing Model

Demo access

Limited / waitlisted.

Capability / Performance

Where this model sits relative to the middle 60% of models in the tree. All scores are 0–10 (higher is better).

Quality: 5.0 (placeholder; no data reported)
Speed: 5.0 (placeholder; no data reported)
Control: 5.0 (placeholder; no data reported)
Consistency: 5.0 (placeholder; no data reported)

Legend
Lower 20%: the 20th percentile; 20% of models score below this.
This model: where the current model lands.
Upper 80%: the 80th percentile; only 20% of models score above this.

Percentile boundaries are computed across every model in the tree that reports the underlying benchmark for each capability.
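The percentile boundaries described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `percentile_band` and `position` are hypothetical, scores are assumed to be plain 0–10 numbers with `None` marking a model that does not report the benchmark, and the interpolation method (`inclusive`) is a guess, since the page does not say how its percentiles are calculated.

```python
from statistics import quantiles


def percentile_band(scores):
    """Return the (20th, 80th) percentile of the reported scores.

    `scores` holds one entry per model; None means the model does not
    report the underlying benchmark and is left out of the calculation.
    """
    reported = [s for s in scores if s is not None]
    if len(reported) < 2:
        raise ValueError("need at least two reported scores")
    # quantiles(n=5) yields the cut points at 20%, 40%, 60% and 80%.
    cuts = quantiles(reported, n=5, method="inclusive")
    return cuts[0], cuts[3]


def position(score, band):
    """Describe where a score falls relative to the middle 60% band."""
    low, high = band
    if score < low:
        return "below the middle 60%"
    if score > high:
        return "above the middle 60%"
    return "within the middle 60%"
```

With a band computed this way, a placeholder score of 5.0 would be reported relative to whatever the other models in the tree score; it carries no information about the model itself.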

What it feels like

Earlier Sora preview, succeeded by Sora 2. The points below describe Sora 2.
First Sora generation with native synchronized audio: speech, sound effects, and ambient soundscape from one model
More realistic physics: a missed basketball shot rebounds off the backboard; objects respect buoyancy and rigidity
Olympic gymnastics, paddleboard backflips, ice-skating triple axels: motion that prior systems couldn't render

Reviews: OpenAI — Sora 2 announcement ↗ · Wikipedia — Sora text-to-video model ↗ · Cybernews — Sora 2 review ↗

Best use cases

Short-form social video generation (the Sora app's whole purpose)
Storyboards / previz where physical accuracy matters more than fine-grained creative control
Custom-character video using the cameo feature for personalisation

Tools to try

ChatGPT · Codex CLI · Cursor · GitHub Copilot · Continue.dev

Not ideal for

Frame-perfect creative control — no full keyframe editing the way Runway / Kling offer
Long-form (>60s) coherent narrative — best on short clips
