Tech · 4h ago

Developer Adds Asymmetric Actor Critic To Pufferlib For Dingbotics

Original post

kache@yacineMTB · #403 in Tech

So next set of improvements for dingbotics: I need to update pufferlib w/ asymmetric actor critic - giving the critic privileged state so it can estimate returns better, which helps it beat up the actor (what actually gets deployed) in the right direction, better

9:19 AM · Jun 16, 2026 · 6K Views
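
To make the idea in the post concrete, here is a toy sketch in plain Python. It is not pufferlib code and not the author's implementation. It assumes a made-up setup in which the return depends on one observed feature and one hidden, privileged feature, then fits two linear critics with SGD: one that sees only the observation and one that also sees the privileged feature. All names (`fit_linear`, `obs`, `priv`) and the return formula are illustrative assumptions.

```python
import random


def fit_linear(features, targets, lr=0.05, epochs=50):
    """Fit targets ~ w . x + b with per-sample SGD; returns (weights, bias)."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, targets):
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def mse(features, targets, w, b):
    """Mean squared error of the linear value estimate."""
    errs = [
        (sum(wi * xi for wi, xi in zip(w, x)) + b - y) ** 2
        for x, y in zip(features, targets)
    ]
    return sum(errs) / len(errs)


rng = random.Random(0)
obs = [rng.uniform(-1, 1) for _ in range(500)]   # what the actor sees
priv = [rng.uniform(-1, 1) for _ in range(500)]  # simulator-only state (assumed)
# Assumed toy return: depends on both the observed and the hidden feature.
returns = [o + 2.0 * h for o, h in zip(obs, priv)]

sym_inputs = [[o] for o in obs]                      # critic sees obs only
asym_inputs = [[o, h] for o, h in zip(obs, priv)]    # critic sees obs + privileged

sym_w, sym_b = fit_linear(sym_inputs, returns)
asym_w, asym_b = fit_linear(asym_inputs, returns)

sym_mse = mse(sym_inputs, returns, sym_w, sym_b)
asym_mse = mse(asym_inputs, returns, asym_w, asym_b)
print(f"obs-only critic MSE:   {sym_mse:.4f}")
print(f"privileged critic MSE: {asym_mse:.4f}")
```

In this toy the observation-only critic cannot explain the part of the return driven by the hidden feature, while the privileged critic can. The actor would still consume `obs` only, which is what keeps it deployable without simulator state.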

Sentiment

Positive users praise adding asymmetric actor critic to Pufferlib for Dingbotics as the right approach, while negative users call it unnecessary complexity and a waste of time.

Positive: 66.7%

Negative: 33.3%

6 comments with sentiment.

Posts from X

Most Activity

Views: 2.4K · Bookmarks: 2 · Likes: 13

kache@yacineMTB

This is something that I learned from reading the mujoco playground baseline. Their network sees more data than mine because they use this; so in a way, it is unfair (even though I'm beating their baseline)

So let's make it a little more fair!

4h ago

Replies: 2

kache@yacineMTB

after some experiments and looking at charts, this is a complete waste of time and barely helps

2h ago · 1.8K views

John Owen@dreamingElvis

@yacineMTB 2017, wow. https://arxiv.org/abs/1710.06542

4h ago

kache@yacineMTB

the critic already learns so well and so quickly that asymmetric actor / critic is very likely not worth the complexity at all

2h ago
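
The thread does not say which charts were used. One common diagnostic for how well a critic tracks returns is explained variance, 1 - Var(returns - values) / Var(returns), where a value near 1 means the critic already predicts returns well. A minimal plain-Python sketch, with a hypothetical function name:

```python
def explained_variance(value_preds, returns):
    """1 - Var(returns - value_preds) / Var(returns); NaN if returns are constant."""
    n = len(returns)
    mean_ret = sum(returns) / n
    var_ret = sum((r - mean_ret) ** 2 for r in returns) / n
    if var_ret == 0:
        return float("nan")
    residuals = [r - v for r, v in zip(returns, value_preds)]
    mean_res = sum(residuals) / n
    var_res = sum((e - mean_res) ** 2 for e in residuals) / n
    return 1.0 - var_res / var_ret


# A critic that predicts returns exactly versus one that predicts a constant.
perfect = explained_variance([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
constant = explained_variance([2.0, 2.0, 2.0], [1.0, 2.0, 3.0])
```

If this number is already close to 1 with an observation-only critic, privileged inputs have little left to improve, which would be consistent with the experience reported in the post.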

kache@yacineMTB

@BurstOfEntropy dont worry. i read every line of code, nothing slips by me : )

4h ago

B Forth@BurstOfEntropy

@yacineMTB this is nice! something probably needs privileged state. just be careful the llms don't give the actor some of that same sweet sweet privilege (they are so bad at programming)

4h ago
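
The leak this reply warns about can be guarded against mechanically. Below is a minimal sketch, assuming observations are flat dicts of scalar floats and that the privileged signals are known by key. The key names and helper functions are hypothetical and do not come from pufferlib or the author's code.

```python
# Hypothetical names for privileged signals; a real environment defines its own.
PRIVILEGED_KEYS = frozenset({"true_velocity", "contact_forces"})


def split_observation(full_state):
    """Return (actor_obs, critic_obs); only the critic keeps privileged keys."""
    actor_obs = {k: v for k, v in full_state.items() if k not in PRIVILEGED_KEYS}
    critic_obs = dict(full_state)
    return actor_obs, critic_obs


def assert_no_leak(actor_obs):
    """Raise if any privileged key reached the actor's input."""
    leaked = PRIVILEGED_KEYS.intersection(actor_obs)
    if leaked:
        raise ValueError(f"privileged keys leaked to actor: {sorted(leaked)}")


def actor_ignores_privileged(act_fn, full_state, bump=1.0):
    """Behavioral check: perturbing privileged values must not change the action.

    act_fn maps the full simulator state to an action, including whatever
    splitting the pipeline does internally.
    """
    perturbed = {
        k: (v + bump if k in PRIVILEGED_KEYS else v) for k, v in full_state.items()
    }
    return act_fn(full_state) == act_fn(perturbed)


state = {"joint_angle": 0.3, "true_velocity": 1.2, "contact_forces": 0.0}


def clean_act(s):
    # Actor path that only reads the filtered observation.
    return 2.0 * split_observation(s)[0]["joint_angle"]


def leaky_act(s):
    # Buggy actor path that peeks at privileged state.
    return 2.0 * s["joint_angle"] + s["true_velocity"]
```

The key-based check catches wiring mistakes, while the perturbation check catches leaks that a key filter cannot see, such as privileged values folded into another feature.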

🎯🔫👌@gurgle_io

@yacineMTB Interesting to think of this as a form of distillation or delegation. The general trend should be inference <<<<<<< training. This is the only way to explain the performance of biological systems. It would be awesome if it somehow became true that your ancestors are watching.

4h ago

B Forth@BurstOfEntropy

@yacineMTB doing it right

4h ago