/Tech38d ago

Joseph Suarez, MIT PhD and PufferAI founder, solves a reinforcement learning problem involving extreme sparsity with state-of-the-art results on a key evaluation environment

A blog post with further details is forthcoming.

10353147917.4K

#403

Original post

Joseph Suarez 🐡@jsuarez#1687inTech

This week, I solved a problem in RL involving ludicrous sparsity that I have been thinking about since 2018. Initial sweeps are showing SOTA on one of our most consistently informative test envs. Blog post soon. For now, you can follow the dev live on stream

7:17 AM · May 22, 2026 · 16K Views

Sentiment

Users praise the builder for solving the long-standing RL sparsity problem with SOTA results, calling the achievement amazing and inspirational.

Pos

100.0%

Neg

0.0%

3 comments with sentiment.

Cluster Engagement

Digg Deeper

No Digg Deeper questions have been answered for this story yet.

Posts from X

Most Activity

VIEWS1.4KBOOKMARKS1LIKES17

kache@yacineMTB

@jsuarez inspirational

Joseph Suarez 🐡@jsuarez

38d1.4K171

RETWEETS12

Joseph Suarez 🐡@jsuarez

38d16K33678

Frank O'Connor@HusbandOfAyn

@jsuarez

38d462

Yuvraj Singh (smolhub.com)@YuvrajS9886

Hi Joseph, so that’s amazing that you were trying to solve the problem and got such results. Actually I have an idea so what I’m thinking that I want to create pentesting environment for capture the flag type of work. For example, you have machine that is vulnerable and I am thinking we could do some long horizon GRPO task with agents and tool use and gain the root problem can be formed. Any ideas?

38d73

Eric@Ex0byt

@jsuarez Enjoyed a few of the sessions nice work.

38d69