I think this means you can collect ~10k hours just from open source datasets, which means basically anyone should be able to build a decent robot foundation model:
500 from the new BitRobot dataset
500 from Galaxea https://huggingface.co/datasets/OpenGalaxea/Galaxea-Open-World-Dataset
3000 from AgiBot https://huggingface.co/datasets/agibot-world/AgiBotWorld2026
~3000 from Open X-Embodiment (though it's mostly pretty bad data) https://robotics-transformer-x.github.io/
830 from EgoDex https://github.com/apple/ml-egodex
~30 from humanoid-everyday (but good quality) https://huggingface.co/datasets/USC-PSI-Lab/humanoid-everyday
3500 from ABC https://huggingface.co/datasets/XDOF/ABC-130k
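Quick sanity check on the ~10k figure, as a rough sketch: this just sums the hour counts listed above. The numbers are the approximate ones quoted in this post and are not independently verified; the `hours` dict and its labels are only for illustration.

```python
# Approximate hours per dataset, copied from the list above (rough, unverified figures).
hours = {
    "BitRobot": 500,
    "Galaxea": 500,
    "AgiBot": 3000,
    "Open X-Embodiment": 3000,
    "EgoDex": 830,
    "humanoid-everyday": 30,
    "ABC": 3500,
}

# Total across all listed datasets.
total = sum(hours.values())

# Total if you skip Open X-Embodiment, which the post calls mostly low quality.
total_without_oxe = total - hours["Open X-Embodiment"]

print(f"all datasets: ~{total} hours")
print(f"without Open X-Embodiment: ~{total_without_oxe} hours")

# Share of the total contributed by each dataset, largest first.
for name, h in sorted(hours.items(), key=lambda kv: -kv[1]):
    print(f"{name:>20}: {h:>5} h ({h / total:.0%})")
```

So the listed numbers add up to a bit over 11k hours, or roughly 8k if you leave out Open X-Embodiment.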
500 hours of data 🤯
We’re still far from internet scale, but it’s incredible that researchers are releasing open source data

