DeepMind Collaboration Enables Multi-Agent Drone Racing With Half the Collision Rate
Here is a more detailed overview in @davsca1's post. Working with his lab is fantastic and I'm glad to be able to integrate our research lab #nomagic into the Swiss robotics and AI ecosystem.
We are excited to share our latest work, "Superhuman Safe and Agile Racing through Multi-Agent Reinforcement Learning," done in collaboration with @GoogleDeepMind. Autonomous drones have reached superhuman speed in isolation, but what happens when multiple agents share the same airspace?

Paper: https://arxiv.org/abs/2605.22748
Website: https://rpg.ifi.uzh.ch/marl
Video: https://youtu.be/TSwtrHQgjD8

Using league-based self-play, we train #ReinforcementLearning agents that race against a diverse, evolving population of opponents. Through this competitive training, sophisticated behaviors emerge without explicit programming: strategic overtaking, proactive collision avoidance, and even awareness of aerodynamic downwash from nearby drones.

In real-world multi-player races at speeds exceeding 80 km/h (50 mph) and accelerations up to 7 g, our agents outperform a five-time Swiss national drone racing champion while reducing collision rates by 50% compared to single-agent baselines. Crucially, training against diverse artificial opponents enables zero-shot generalization to human pilots, achieving over 90% race completion in mixed human-AI races with up to four competitors.

A key insight: human pilots adopt riskier strategies when trailing, leading to more crashes under competitive pressure. Our learned policies, by contrast, maintain consistent safety margins regardless of race standing, a property essential for deploying autonomous systems alongside humans. Also, the multi-agent self-play policies are more robust than those trained independently, suggesting that training in competitive environments is not only key to winning races but also to learning safer, more reliable autonomy for real-world multi-robot systems.

Kudos to Ismail Geles, Leonard Bauersfeld, Markus Wulfmeier! @isgeles @l_bauersfeld @m_wulfmeier @ERC_Research @uzh_ifi @UZH_en @UZH_Science @UZHspacehub @swissrobotics @nccrrobotics
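For readers unfamiliar with league-based self-play, here is a rough Python sketch of the general idea: the learner races against opponents sampled from a pool of its own frozen past versions, and the pool keeps growing as training proceeds. Everything in it is an assumption made for illustration and is not taken from the paper: the `League` class, `train_with_league`, the snapshot schedule, and the placeholder "policy" (a dict holding a version counter) are all hypothetical. A real setup would replace the placeholder update with an actual reinforcement learning step in a racing simulator.

```python
import random
from dataclasses import dataclass, field


@dataclass
class League:
    """Hypothetical pool of frozen past policies used as opponents."""
    max_size: int = 8
    snapshots: list = field(default_factory=list)

    def add(self, policy: dict) -> None:
        # Store a copy so later updates to the learner do not change it.
        self.snapshots.append(dict(policy))
        # Drop the oldest snapshot once the pool is full.
        if len(self.snapshots) > self.max_size:
            self.snapshots.pop(0)

    def sample_opponents(self, k: int, rng: random.Random) -> list:
        # Sample with replacement so k may exceed the pool size.
        return [rng.choice(self.snapshots) for _ in range(k)]


def train_with_league(iterations: int, snapshot_every: int,
                      num_opponents: int, seed: int = 0) -> League:
    rng = random.Random(seed)
    learner = {"version": 0}  # stand-in for real policy parameters
    league = League()
    league.add(learner)  # the initial policy is the first opponent
    for step in range(1, iterations + 1):
        opponents = league.sample_opponents(num_opponents, rng)
        # Placeholder: a real system would race against `opponents`
        # here and apply an RL update to the learner.
        learner["version"] += 1
        if step % snapshot_every == 0:
            league.add(learner)  # freeze the current learner into the pool
    return league
```

Sampling from the whole pool, not only the latest snapshot, is one simple way to keep the opponent population diverse. Whether the authors weight or curate their league differently is not stated in the post.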
Fun bit: In 1v1 drone racing, if you start ahead of the opponent and don't make any mistakes, you can actually ignore them. This becomes harder with more opponents. Imagine a game of chicken: in the 1v1 setting you do really well by flying the time-optimal path regardless of what the opponent does. If they fly it as well, both of you lose; if they don't, you win. But the chance of a collision becomes too big as the number of opponents grows.
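To make the game-of-chicken intuition concrete, here is a toy calculation in Python. It is only a sketch under loud assumptions that are not from the paper: each opponent independently contests the time-optimal line with the same probability `p_contest`, any contested line means a crash, and the payoffs (`win=1.0`, `crash=-1.0`) are arbitrary. The function names are hypothetical.

```python
def collision_probability(num_opponents: int, p_contest: float) -> float:
    """Chance that at least one opponent contests the time-optimal line.

    Assumes opponents decide independently with the same probability.
    """
    if num_opponents < 0:
        raise ValueError("num_opponents must be non-negative")
    if not 0.0 <= p_contest <= 1.0:
        raise ValueError("p_contest must be in [0, 1]")
    return 1.0 - (1.0 - p_contest) ** num_opponents


def expected_payoff(num_opponents: int, p_contest: float,
                    win: float = 1.0, crash: float = -1.0) -> float:
    """Expected payoff of always flying the time-optimal line.

    Assumed toy payoffs: `win` if nobody contests, `crash` otherwise.
    """
    p_crash = collision_probability(num_opponents, p_contest)
    return (1.0 - p_crash) * win + p_crash * crash


if __name__ == "__main__":
    for n in (1, 2, 3):
        p = collision_probability(n, 0.2)
        ev = expected_payoff(n, 0.2)
        print(f"{n} opponent(s): P(collision)={p:.3f}, expected payoff={ev:.3f}")
```

Under these assumptions the collision chance is 1 - (1 - p)^n, so it rises with every added opponent, and the "ignore everyone" strategy that pays off in 1v1 loses value quickly in larger fields.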
During the project, I was repeatedly reminded of the recent successes in AI for poker (Moravčík et al. and @polynoamial et al.), which climbed a similar ladder. Worth a read if the space interests you!
1v1/heads-up: https://www.science.org/doi/10.1126/science.aam6960 https://www.science.org/doi/10.1126/science.aao1733
Full game: https://www.science.org/doi/10.1126/science.aay2400
Page with links to paper and videos https://rpg.ifi.uzh.ch/marl/