IMAGE MODEL World Labs

World Labs Marble

Spatially-Intelligent 3D World Generator

A generative model that produces editable 3D worlds — geometry, materials, lighting — from a text prompt or a reference image. The output drops into existing game engines and VFX pipelines as actual 3D assets, not video frames.

Why it matters

Marble is the first commercial release under Fei-Fei Li's "spatial intelligence" framing: the claim that AI's next frontier is 3D-grounded perception and generation, not larger language models. Its output is editable 3D rather than video, which is the structural distinction from the Sora / Veo line of video generators. If the framing holds, Marble is the seed of a parallel branch of generative AI.

Core Capabilities

Generative
Produces images, video, audio, or other media.
Vision
Understands images, scenes, and visual context.
Multimodal
Combines text, vision, and audio in one model.

Context Window

Context window not disclosed.

Availability

API
Available
Product / App
Available
Open Source
Not released
Enterprise
Contact sales

Pricing Model

Pay per token
Input and output billed separately.

Best use cases

  • Game development asset generation (World Labs)
  • Visual-effects pre-visualization (World Labs)

Not ideal for

  • Fully offline / air-gapped deployment.
  • Text-heavy reasoning and coding workloads (use an LLM).

Li, Fei-Fei · Johnson, Justin · Lassner, Christoph · Mildenhall, Ben