VLA MODEL | Mar 2025 | Google DeepMind | Last updated: Apr 29, 2026

Gemini Robotics

VLA from DeepMind

DeepMind's vision-language-action (VLA) family: Gemini Robotics, Robotics-ER, On-Device, and Robotics 1.5/ER 1.5 (Sept 2025). Built on the same Gemini multimodal foundation, fine-tuned to output robot motor commands instead of text. Deployed on Boston Dynamics Atlas, Apptronik Apollo, and partner humanoid platforms, with multi-embodiment Motion Transfer.

Why it matters

Gemini Robotics established VLA as a real product category, not just a research topic. Together with Physical Intelligence's π series, NVIDIA GR00T, and Figure Helix, it puts embodied AI at roughly the same "early commercialization" point that LLMs reached in 2022.

Core Capabilities

Agent Workflows

Built for tool use and autonomous tasks.

Multimodal

Combines text, vision, and audio in one model.

Generative

Produces images, video, audio, or other media.

Audio

Speech, music, or other audio understanding/synthesis.

Context Window

Context window not disclosed.

Availability

API: Available

Product / App: Not available

Open Source: Not released

Enterprise: Contact sales

Pricing Model

Pay per token

Input and output billed separately.

What it feels like

Vision-language-action model from Google DeepMind; see the linked sources below for benchmark and review coverage
Tool-use and agent loops are the typical fit per the published model card
Vision and multimodal tasks are the typical fit per the published model card
Audio synthesis or transcription per the published model card

Best use cases

Agent / tool-use workflows that match the model's published benchmarks
Vision tasks (charts, documents, images) per the model card
Audio synthesis / transcription tasks per the model card
See the model spec and sources block for benchmarked use cases

Tools to try

Gemini app, AI Studio, Vertex AI

Not ideal for

Tasks far outside the modalities listed in this model's spec
Workflows where a more recent successor in the same family scores higher
