/Tech · 6h ago

Builder Runs MuJoCo RL Training Loop For Unitree Robot At 200k SPS In Browser

42574118034.6K

Original post

kache@yacineMTB · #487 in Tech

I've gotten a MuJoCo sim RL training loop for a Unitree robot running at 200k SPS. I'm looking into the physics for friction and contact dynamics. My goal: can I reproduce & beat the MuJoCo Playground RL baselines?

This is running in my web browser with raylib. It's the baseline

6:54 AM · Jun 10, 2026 · 19.3K Views

Sentiment

Positive users praise the browser MuJoCo RL demo for Unitree robots and its clear analogies, while negative users criticize the robot's awkward movement and voice distrust of related AI tools like Fable.

Pos: 62.5%

Neg: 37.5%

24 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views 10.6K · Bookmarks 29 · Likes 245 · Retweets 3 · Replies 14

kache@yacineMTB

Unfortunately, I can't use any Anthropic models for this, because there is no way for me to know whether I'm silently getting nerfed & sandbagged. Not allowed to make the world a better place

Not enough that they're closed, open source devs need to be slowed down too

kache@yacineMTB

This is running in my web browser with raylib. It's the baseline

6h10.6K24529

kache@yacineMTB

My suspicion is that the contact dynamics code is not at all optimized for ultra-fast RL. The only reason it is as slow as it is is that no one has really tried, and I'm going to try

This is all going to be open source. It's a project. My goal is to accelerate robotics

kache@yacineMTB

This is running in my web browser with raylib. It's the baseline

6h2.9K761

kache@yacineMTB

The problem with the "frontier" distinction is that it is laughably easy to be frontier: there are so many green fields not yet explored. It's like saying you're at the frontier in the middle of Kansas. There's fertile ground everywhere.

kache@yacineMTB

Unfortunately, I can't use any Anthropic models for this, because there is no way for me to know whether I'm silently getting nerfed & sandbagged. Not allowed to make the world a better place

Not enough that they're closed, open source devs need to be slowed down too

6h1.8K320

kache@yacineMTB

@ChaseBrowe32432 silently giving people horse shit information is poisoning the well for everyone. why don't they make it clearly refuse?

3h12014

kache@yacineMTB

Particularly, reinforcement learning for robotics, that anyone can do with a single GPU at home. It's going to be CUDA only (thanks jensen..). I am using 3x 4090s, but each training baseline will run on a single 4090.

0 Python will be used

6h80112

kache@yacineMTB

I'm going to base my work on top of PufferLib and follow along with their development. They are scarily good at making environments for all kinds of tasks. I've done remarkable things with their RL loop. They're taking contracts if you think you have a problem they can help with

6h3498

lycaon@lyc_aon

@yacineMTB this guy walkin dumb as hayl

5h363

e@yihyunCS

@yacineMTB @ChaseBrowe32432 active deception is the key issue here that people keep missing, and i have to wonder if they're missing the point on purpose

3h322

tooz@adarshsolanki

@yacineMTB bro lets move the families to shenzhen for a few years

4h941

kache@yacineMTB

@LouisLeLay4 Yes! I'm going to follow in homemadegarbage's footsteps and make homemade garbage move. The goal is to give people a reproducible RL loop they can use

2h691

Louis Le Lay@LouisLeLay4

@yacineMTB Will you deploy on real robot afterwards?

2h651

Daniel Fredriksen 🇺🇲@runequantum

@yacineMTB Wouldn't a better sim randomize some elements of the physics parameters so that your model learns to adapt to differences in physics so when it gets to the real world it just works?

3h105
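
The randomization suggested in the reply above is usually called domain randomization. A minimal sketch of the idea follows, assuming physics parameters are resampled once per episode. The parameter names (`friction`, `mass_scale`, `motor_strength`) and the ranges are illustrative assumptions, not values from the thread or from MuJoCo.

```python
import random
from dataclasses import dataclass


@dataclass
class PhysicsParams:
    """Hypothetical per-episode physics settings (names are illustrative)."""
    friction: float
    mass_scale: float
    motor_strength: float


# Assumed sampling ranges; real values would need tuning per robot.
RANGES = {
    "friction": (0.4, 1.2),
    "mass_scale": (0.9, 1.1),
    "motor_strength": (0.8, 1.2),
}


def sample_physics(rng: random.Random) -> PhysicsParams:
    """Draw one set of physics parameters uniformly from RANGES."""
    return PhysicsParams(**{k: rng.uniform(lo, hi) for k, (lo, hi) in RANGES.items()})


# Resample at the start of every episode so the policy never sees
# exactly the same physics twice.
rng = random.Random(0)
episode_params = [sample_physics(rng) for _ in range(4)]
for p in episode_params:
    print(p)
```

In a real training loop the sampled values would be written into the simulator's model before each reset; how that is done depends on the simulator and is not shown here.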

ICanAutomateThat@Jack1000k

@yacineMTB Convex decomposition sucks. But probably not going to be an issue for basic humanoid stuff. Maybe could approximate some collisions with spheres and make it go faster than convex shapes would.

2h211
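
The sphere approximation mentioned in the reply above is cheap because a sphere test needs only a distance and a radius sum. A minimal sketch follows, assuming bodies are covered by hand-placed spheres and the ground is a flat plane at a fixed height. The function names are hypothetical and are not part of any MuJoCo API.

```python
import math


def sphere_sphere_penetration(c1, r1, c2, r2):
    """Penetration depth between two spheres (positive when they overlap)."""
    return (r1 + r2) - math.dist(c1, c2)


def sphere_plane_penetration(center, radius, plane_height=0.0):
    """Penetration of a sphere into a horizontal ground plane at z = plane_height."""
    return radius - (center[2] - plane_height)


# Two unit spheres whose centers are 1.5 apart overlap by 0.5.
depth = sphere_sphere_penetration((0.0, 0.0, 0.0), 1.0, (1.5, 0.0, 0.0), 1.0)

# A foot sphere of radius 0.1 centered 0.05 above the ground sinks in by 0.05.
foot_depth = sphere_plane_penetration((0.0, 0.0, 0.05), 0.1)
```

A convex-mesh test typically needs an iterative algorithm per pair, so the trade is accuracy of the shape for a constant-time check.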

someguy AI@NOTfunnyparanR

Race one vs a “ghost time”.

“ghost” = moving average of the average time per section of the race track. Reward is distance ahead of ghost applied for the time it sustains lead. Negative if behind.

Robustify it… Add uneven surfaces and barriers on the track, and penalize for how much it strays from center.

2h38
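
The ghost-time reward proposed above could be sketched roughly as follows. This assumes equal-length track sections and an exponential moving average for the ghost's per-section times. Both are assumptions, since the reply only says "moving average", and the `Ghost` class and `step_reward` function are hypothetical names.

```python
class Ghost:
    """Tracks a moving average of the time taken per track section."""

    def __init__(self, num_sections, section_length, initial_time, alpha=0.1):
        self.section_length = section_length
        self.alpha = alpha  # assumed smoothing factor for the moving average
        self.avg_times = [initial_time] * num_sections

    def update(self, section, elapsed):
        """Fold a newly observed section time into the moving average."""
        a = self.alpha
        self.avg_times[section] = (1 - a) * self.avg_times[section] + a * elapsed

    def distance_at(self, t):
        """Distance the ghost has covered t seconds into the lap."""
        dist = 0.0
        for avg in self.avg_times:
            if t >= avg:
                dist += self.section_length
                t -= avg
            else:
                return dist + self.section_length * t / avg
        return dist


def step_reward(ghost, agent_distance, t, dt):
    """Lead over the ghost, weighted by the step duration; negative if behind."""
    return (agent_distance - ghost.distance_at(t)) * dt


ghost = Ghost(num_sections=2, section_length=10.0, initial_time=5.0)
reward_ahead = step_reward(ghost, agent_distance=8.0, t=2.5, dt=0.1)
reward_behind = step_reward(ghost, agent_distance=3.0, t=2.5, dt=0.1)
```

Summing `step_reward` over an episode gives the lead integrated over time, which matches "distance ahead of ghost applied for the time it sustains lead".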

Omar@kouhxp

@yacineMTB you go to a restaurant and order a ribeye, but the chef swaps it for the cheapest cut because it's good for you, you still pay for a ribeye, and the kitchen has tinted windows

5h32

Chase Brower@ChaseBrowe32432

@yihyunCS @yacineMTB really? they could have easily done this without notifying anyone, and no one would have ever known. they mentioned that they're doing this in the spirit of honesty

3h21

Santi@__selewaut__

@yacineMTB You're cooking

6h51

Shamanic Depressive Gymnosophist (book in bio)@sigilante

@yacineMTB I love how when he falls only the feet have collision bounds.

6h312

Rajesh Poddar@rajeshpod

@yacineMTB If you only simulate the foot/ground contact, the sparsity structure of the constraint Hessian, i.e., JtDJ, is the same as the sparsity structure of the JSIM. As far as I'm aware, none of the official MuJoCo implementations exploit this fact.

3h861

Harrison Kinsley@Sentdex

@yacineMTB I can already train locomotion very, very fast. 200K SPS is fast, but it changes nothing. If you could 10x the speed, maybe that's relevant. But even then, this isn't where help is needed.

Where we need help is in high-contact physics envs.

1h781