DeepMind Enables Multi-Agent Drone Racing With Half Crash Rate

Original post

The world is highly multi-agent - that holds for AI and robots as well as for anyone else. Extending existing AI solutions from single-agent (or 1v1) settings to multiple opponents is always a significant leap in complexity. We now enable this transition in the highly demanding domain of drone racing, with high-frequency control at speeds of 80 km/h and accelerations of up to 7g!

In pure 1v1, achieving superhuman performance means being time-optimal and largely ignoring the opponent - this is the foundation of previous key successes. Multi-agent racing now requires a new paradigm for safety and superhuman agility.

To achieve this, we need to train with realistic competitors. Just as targeted randomisation has allowed us to bridge the sim-to-real gap for physics and visuals, we can generate highly realistic opponent behaviour in simulation. We use a variant of league play to successfully transfer these dynamic, multi-agent policies to the physical world. Many other advances in the preprint!

The result: our solution reduces crash rates by half and wins against the 5x Swiss national champion.

An incredible collaboration between @GoogleDeepMind and @davsca1's team at @UZH_en, led by @isgeles and with @l_bauersfeld, paving the way for high-performing AND safe robotics solutions in our inherently multi-agent world. More useful insights and resources in the thread!

#multiagent #racing #robots #reinforcementlearning

7:51 AM · May 28, 2026

QUOTE POST

Markus Wulfmeier @m_wulfmeier

Here is a more detailed overview in @davsca1's post. Working with his lab is fantastic and I'm glad to be able to integrate our research lab #nomagic into the Swiss robotics and AI ecosystem.

Davide Scaramuzza @davsca1

We are excited to share our latest work, "Superhuman Safe and Agile Racing through Multi-Agent Reinforcement Learning," done in collaboration with @GoogleDeepMind. Autonomous drones have reached superhuman speed in isolation, but what happens when multiple agents share the same airspace?

Paper: https://arxiv.org/abs/2605.22748
Website: https://rpg.ifi.uzh.ch/marl
Video: https://youtu.be/TSwtrHQgjD8

Using league-based self-play, we train #ReinforcementLearning agents that race against a diverse, evolving population of opponents. Through this competitive training, sophisticated behaviors emerge without explicit programming: strategic overtaking, proactive collision avoidance, and even awareness of aerodynamic downwash from nearby drones.

In real-world multi-player races at speeds exceeding 80 km/h (50 mph) and accelerations up to 7g, our agents outperform a five-time Swiss national drone racing champion while reducing collision rates by 50% compared to single-agent baselines. Crucially, training against diverse artificial opponents enables zero-shot generalization to human pilots, achieving over 90% race completion in mixed human-AI races with up to four competitors.

A key insight: human pilots adopt riskier strategies when trailing, leading to more crashes under competitive pressure. Our learned policies, by contrast, maintain consistent safety margins regardless of race standing, a property essential for deploying autonomous systems alongside humans. Also, the multi-agent self-play policies are more robust than those trained independently, suggesting that training in competitive environments is not only key to winning races but also to learning safer, more reliable autonomy for real-world multi-robot systems.

Kudos to Ismail Geles, Leonard Bauersfeld, Markus Wulfmeier! @isgeles @l_bauersfeld @m_wulfmeier @ERC_Research @uzh_ifi @UZH_en @UZH_Science @UZHspacehub @swissrobotics @nccrrobotics

4:00 PM · May 26, 2026 · 11.3K Views

2:51 PM · May 28, 2026 · 412 Views
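
For readers who want a feel for what "a diverse, evolving population of opponents" can look like in code, here is a minimal toy sketch of a league-style opponent pool. It is not the method from the paper: the `League` class, the `run_toy_league` function, the snapshot labels, the pool size, and the snapshot interval are all hypothetical names and values chosen for illustration, and the actual reinforcement learning update is left as a placeholder comment.

```python
import random
from dataclasses import dataclass, field


@dataclass
class League:
    """Toy opponent pool holding frozen policy snapshots (here just string labels)."""
    snapshots: list = field(default_factory=list)
    max_size: int = 8  # hypothetical cap on the pool size

    def add_snapshot(self, policy_id: str) -> None:
        # Append the newest snapshot and drop the oldest once the pool is full.
        self.snapshots.append(policy_id)
        if len(self.snapshots) > self.max_size:
            self.snapshots.pop(0)

    def sample_opponents(self, k: int, rng: random.Random) -> list:
        # Draw k opponents uniformly, with replacement, from past snapshots.
        if not self.snapshots:
            raise ValueError("league is empty")
        return [rng.choice(self.snapshots) for _ in range(k)]


def run_toy_league(iterations: int, opponents_per_race: int,
                   snapshot_every: int, seed: int = 0):
    """Return the final league and the opponents drawn at each iteration."""
    rng = random.Random(seed)
    league = League()
    league.add_snapshot("policy_v0")
    schedule = []
    for step in range(1, iterations + 1):
        opponents = league.sample_opponents(opponents_per_race, rng)
        schedule.append(opponents)
        # Placeholder: a real system would race the learner against
        # `opponents` in simulation and update the learner's policy here.
        if step % snapshot_every == 0:
            league.add_snapshot(f"policy_v{step}")
    return league, schedule


if __name__ == "__main__":
    league, schedule = run_toy_league(iterations=6, opponents_per_race=3,
                                      snapshot_every=2)
    print(league.snapshots)
    print(schedule[-1])
```

Sampling from past snapshots rather than only the latest policy is one common way to keep the opponent population varied; real league setups typically use more elaborate matchmaking than the uniform sampling assumed here.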

Markus Wulfmeier @m_wulfmeier

Fun bit: In 1v1 drone racing, if you start ahead of the opponent and don't make any mistakes, you can actually ignore them. This becomes harder with more opponents. Imagine a game of chicken where, in the 1v1 setting, you do really well by flying the time-optimal path regardless of the opponent. If they fly it as well, both of you lose; if they don't, you win. But the chance of a collision becomes too big as the number of opponents grows.


2:51 PM · May 28, 2026 · 94 Views
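
The game-of-chicken argument above can be made concrete with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that each opponent independently contests your time-optimal line with the same probability `p_conflict`; neither that independence assumption nor the value 0.2 comes from the thread or the paper, and `collision_probability` is a hypothetical helper name.

```python
def collision_probability(n_opponents: int, p_conflict: float) -> float:
    """Chance that at least one of n independent opponents contests your line.

    Assumes each opponent conflicts independently with probability p_conflict,
    which is an illustrative simplification, not a model from the paper.
    """
    if n_opponents < 0:
        raise ValueError("n_opponents must be non-negative")
    if not 0.0 <= p_conflict <= 1.0:
        raise ValueError("p_conflict must be in [0, 1]")
    # P(at least one conflict) = 1 - P(no opponent conflicts)
    return 1.0 - (1.0 - p_conflict) ** n_opponents


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, round(collision_probability(n, 0.2), 3))
```

Under these assumptions the risk of ignoring everyone else rises from 0.2 with one opponent to roughly 0.49 with three, which is the intuition behind needing opponent-aware policies once the field grows.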


Markus Wulfmeier @m_wulfmeier

Page with links to paper and videos https://rpg.ifi.uzh.ch/marl/

Markus Wulfmeier @m_wulfmeier

During the project, I was repeatedly reminded of the recent successes in AI for poker (Moravčík et al. and @polynoamial et al.), which traversed a similar ladder. Worth a read if the space interests you!

1v1/heads-up:
https://www.science.org/doi/10.1126/science.aam6960
https://www.science.org/doi/10.1126/science.aao1733

Full game:
https://www.science.org/doi/10.1126/science.aay2400

2:51 PM · May 28, 2026 · 72 Views

2:51 PM · May 28, 2026 · 67 Views
