The RoboLab leaderboard is interesting -- still fairly noisy (i.e., its results don't match other leaderboards like RoboArena or MolmoSpaces). Suggests we're pretty far from a truly general-purpose robotics model, IMO. The data a model is trained on is still a huge differentiator.
RoboLab Leaderboard Shows Noisy Results for Open-Source Robotics Models
It is at least cool that we are FINALLY getting closer to real, useful evaluations in robotics that are somewhat meaningful and reliable -- even if they're still scattered and not THAT informative yet.