World Labs Marble
Spatially-Intelligent 3D World Generator
A generative model that produces editable 3D worlds — geometry, materials, lighting — from a text prompt or a reference image. The output drops into existing game engines and VFX pipelines as actual 3D assets, not video frames.
Why it matters
This is the first commercial release under Fei-Fei Li's "spatial intelligence" framing: the claim that AI's next frontier is 3D-grounded perception and generation, not larger language models. Marble outputs editable 3D rather than video, which is the structural distinction from the Sora / Veo line. If the framing holds, this is the seed of a parallel branch of generative AI.
Core Capabilities
Generative
Produces images, video, audio, or other media.
Vision
Understands images, scenes, and visual context.
Multimodal
Combines text, vision, and audio in one model.
Context Window
Not disclosed.
Availability
API
Available
Product / App
Available
Open Source
Not released
Enterprise
Contact sales
Pricing Model
Pay per token
Input and output billed separately.
Best use cases
- Game development asset generation (World Labs)
- Visual-effects pre-visualization (World Labs)
Not ideal for
- Fully offline / air-gapped deployment.
- Text-heavy reasoning and coding workloads (use an LLM).