/Tech · 2h ago

Builder Urges Focus on Simulators for RL Training Optimization

Original post

kache@yacineMTB · #487 in Tech

@tydsh Please point this at simulators and optimizing them for RL training loops

Yuandong Tian@tydsh

Early results from Recursive 🚀🚀

SotA results from our open-ended knowledge discovery system:

1️⃣NanoChat 5min pre-training (0.9372 bpb -> 0.9109 bpb, 2.8% lower Bits-Per-Byte than long-standing community SoTA)

2️⃣NanoGPT SpeedRun (79.7s -> 77.5s, 2.8% faster than long-standing community SoTA)

3️⃣GPU kernel optimization (Overall 7.8% better than SoTA performance in SOL-ExecBench, hosted by NVIDIA)

To achieve that, our system automatically finds and combines innovations together to find better solutions than current ones carefully designed by expert humans in various domains.

We have open-sourced resulting artifacts found by our system so you can check the output yourself. See a full breakdown and technical writeup:

https://www.recursive.com/articles/first-steps-toward-automated-ai-research

6:21 AM · Jun 11, 2026 · 273 Views
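As a quick sanity check on the figures quoted in the post, the sketch below recomputes the two relative improvements from the before/after numbers given there. The helper name `relative_improvement` is made up for illustration, and it assumes "2.8% lower" and "2.8% faster" both mean the reduction divided by the previous value; the post does not spell out its formula.

```python
def relative_improvement(before: float, after: float) -> float:
    """Fractional reduction from `before` to `after` (lower is better)."""
    return (before - after) / before


# Figures as quoted in the post; the interpretation is an assumption.
nanochat = relative_improvement(0.9372, 0.9109)   # bits per byte
speedrun = relative_improvement(79.7, 77.5)       # seconds

print(f"NanoChat bpb reduction:  {nanochat * 100:.1f}%")
print(f"NanoGPT time reduction:  {speedrun * 100:.1f}%")
```

Under that reading, both quoted figures round to the 2.8% stated in the post.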

Sentiment

The one comment with sentiment is enthusiastic about pointing the system at simulators for RL training loops, on the grounds that it could make MuJoCo training 10x or even 100x faster.

Pos: 100.0%

Neg: 0.0%

1 comment with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views: 247 · Likes: 3 · Replies: 1

kache@yacineMTB

@tydsh Imagine if you could train mujoco 10x faster.. 100x faster

kache@yacineMTB

@tydsh Please point this at simulators and optimizing them for RL training loops

2h24730

kache@yacineMTB

@tydsh The bottleneck might not be the neural nets right now :)

kache@yacineMTB

@tydsh Imagine if you could train mujoco 10x faster.. 100x faster

2h3800