
Webscale Video Pretraining Boosts Robot Dexterity and Sample Efficiency

Original post

Robotics fundamentally involves understanding how things in the world change in response to action and force. These dynamics are impossible to learn from static images; it’s far more effective and more data-efficient to learn them from video. @elvisnavah joins us to talk about @mimicrobotics. One of the key findings from mimic-video is that pretraining on webscale video lets robots learn physics priors; as a result, policies train faster, generalize better, and achieve more impressive dexterity than policies built on static images or image-language pairs, as in a VLM. Watch Episode #81 of RoboPapers with @micoolcho and @chris_j_paxton to learn more!

6:00 AM · May 20, 2026

Video action models are data-efficient and allow robots to learn complex dexterous tasks. Learn more in our episode with @elvisnavah of @mimicrobotics ->

RoboPapers @RoboPapers


1:00 PM · May 20, 2026 · 5.9K Views
2:56 PM · May 20, 2026 · 3K Views