World models are increasingly central to how agents learn and plan.
Today we're releasing WorldModelGym, a benchmark built around a single question: if an agent uses a world model to choose among actions, does it pick the right one?
We call this decision-based fidelity. The benchmark covers 100+ tracks across Atari, Meta-World, DeepMind Control, and classic control. One frozen policy. Reality scores it.
Read the full post → https://reka.ai/labs/research/worldmodelgym