Oliver Cameron, co-founder and CEO of Odyssey ML, introduced Starchild-1 as the first real-time multimodal world model that generates interactive simulations incorporating synchronized audio and accepts continuous text, speech, and action inputs

Starchild-1 uses causal autoregressive prediction across audio and video to support stable long-horizon interactive simulations, advancing toward general-purpose world simulators.

Original post

Introducing Starchild-1 from @odysseyml, the first ever real-time multimodal world model. This is a model that can generate interactive simulations of the world that you can, for the first time ever, hear. Starchild-1 represents a big step towards a general-purpose world simulator.

9:40 AM · May 18, 2026

😍😍😍😍

4:51 PM · May 18, 2026 · 1.6K Views

@olivercameron @odysseyml A serious technical step, as audio forces the model to learn hidden physical and social structure that silent video can often fake.

So many possibilities I can think of right away.

4:58 PM · May 18, 2026 · 219 Views

Introducing Starchild-1 from @odysseyml, the first ever real-time multimodal world model.

This is a model that can generate interactive simulations of the world that you can, for the first time ever, hear.

Starchild-1 represents a big step towards a general-purpose world simulator.

4:40 PM · May 18, 2026 · 131.9K Views

Starchild-1 is an early step beyond world models that learn only from visual observation, toward systems that learn from richer multimodal interaction with the world.

4:40 PM · May 18, 2026 · 2.3K Views

My dream is that models like Starchild-1 unlock entirely new forms of education, gaming, companionship, robotics, and brand new computing devices.

We're so early on this journey, but I'm so excited. The team did truly great work here.

https://odyssey.ml/introducing-starchild-1
Starchild-1: The First Real-Time Multimodal World Model
AI to simulate both the visuals and sounds of the world, all in real-time.

4:40 PM · May 18, 2026 · 2.1K Views

We're also releasing an accompanying technical report of Starchild-1, to share our learnings and encourage further research in this direction!

4:40 PM · May 18, 2026 · 1.7K Views