𝗥𝗼𝗯𝗼𝘁𝘀 𝗱𝗼𝗻’𝘁 𝗻𝗲𝗲𝗱 𝗺𝗼𝗿𝗲 𝗱𝗲𝗺𝗼𝗻𝘀𝘁𝗿𝗮𝘁𝗶𝗼𝗻𝘀. 𝗧𝗵𝗲𝘆 𝗻𝗲𝗲𝗱 𝘁𝗼 𝗹𝗲𝗮𝗿𝗻 𝗳𝗿𝗼𝗺 𝗳𝗮𝗶𝗹𝘂𝗿𝗲 — 𝗮𝗳𝘁𝗲𝗿 𝘄𝗮𝘁𝗰𝗵𝗶𝗻𝗴 𝗵𝘂𝗺𝗮𝗻𝘀.
Most robot learning systems assume failure is the end of learning.
In our new work, we study whether robots can improve after deployment by learning from their own failures, without any human intervention, teleoperation, or corrective labels.
The key idea is simple: human videos contain structure about how the world works. We use them to learn cross-embodiment representations of action, dynamics, and value, enabling a shared predictive space between human behavior and robot experience. This allows a new learning loop:
👉 pretrain on human videos
👉 deploy robot policy
👉 observe failures
👉 reinterpret failures using human priors
👉 improve autonomously
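To make the loop above concrete, here is a toy sketch in Python. It is not the method from the paper: the 1-D task, the `TARGET` constant, and every function name (`pretrain_value_from_human_videos`, `rollout`, `repair`, `self_improve`) are hypothetical stand-ins chosen for illustration. It only assumes that a prior learned from human data can score actions, and that failed actions can be nudged toward what that prior prefers, with no human labels in the loop.

```python
import random
from statistics import mean

TARGET = 1.0  # hypothetical "correct" action for this toy 1-D task


def pretrain_value_from_human_videos(human_actions):
    """Toy stand-in for a human-video prior: score actions by closeness to the human mean."""
    center = mean(human_actions)
    return lambda a: -abs(a - center)


def rollout(theta, rng, n=20, noise=0.3, tol=0.2):
    """Deploy the policy: sample actions around theta and mark each as success or failure."""
    actions = [theta + rng.gauss(0.0, noise) for _ in range(n)]
    return [(a, abs(a - TARGET) < tol) for a in actions]


def repair(action, value, step=0.2):
    """Reinterpret a failed action: keep the nearby candidate the prior scores highest."""
    return max((action - step, action, action + step), key=value)


def self_improve(theta, value, rng, rounds=10):
    """Run the deploy / observe / repair / update loop and record per-round success rates."""
    history = []
    for _ in range(rounds):
        episodes = rollout(theta, rng)
        history.append(sum(ok for _, ok in episodes) / len(episodes))
        # Successes are kept as-is; failures are repaired with the prior, not by a human.
        targets = [a if ok else repair(a, value) for a, ok in episodes]
        theta = mean(targets)  # simplest possible policy update: fit to the targets
    return theta, history


rng = random.Random(0)
value = pretrain_value_from_human_videos([0.9, 1.0, 1.1])
theta, history = self_improve(theta=0.0, value=value, rng=rng)
print(f"final theta: {theta:.2f}, success rate per round: {history}")
```

In this toy setup the policy starts far from the target, so nearly every early rollout fails. The repaired failures are what pull `theta` toward the region the prior prefers.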
We evaluate this across 7 real-world manipulation tasks, showing:
📈 40% → 81% success rate
🏆 Strong improvements over π0.6 RECAP and RISE
✔️ Zero human intervention during post-deployment improvement
🧬 Generalizes across robot embodiments and policy backbones
A key finding: explicit failure repair clearly outperforms failure reweighting under identical data conditions (+25 pts vs +5 pts on the same π0.5 base policy).
Overall, the results suggest a shift in how we think about robot learning:
Human videos are not only for pretraining policies. They can provide the structure needed for continual self-improvement after deployment.
📄 Paper: https://arxiv.org/pdf/2606.21406
🌐 Project: https://ethz-mrl.github.io/robot-self-improvement-website/
I am grateful to have worked with the fantastic leads @hanzhic678 and @Anran_zh, and our collaborators Simon Schaefer, Kejia Chen, Shi Chen, and Daniel Cremers. Special thanks to @StefanLeuteneg1 for co-advising this project with me.
@ETH @TU_Muenchen @Microsoft
Check out Hanzhi's 🧵 for more details