This week, I solved a problem in RL involving ludicrous sparsity that I have been thinking about since 2018. Initial sweeps are showing SOTA on one of our most consistently informative test envs. Blog post soon. For now, you can follow the dev live on stream
Joseph Suarez, MIT PhD and PufferAI founder, reports solving a reinforcement learning problem involving extreme sparsity, with initial sweeps showing state-of-the-art results on a key evaluation environment.
A blog post with further details is forthcoming.
Users praise the builder for solving the long-standing RL sparsity problem with SOTA results, calling the achievement amazing and inspirational.
@jsuarez inspirational

@jsuarez

Hi Joseph, it's amazing that you kept working on this problem and got such results. I have an idea: I want to create a pentesting environment for capture-the-flag-style work. For example, given a vulnerable machine, we could set up a long-horizon GRPO task with agents and tool use, with the problem framed as gaining root. Any ideas?

@jsuarez Enjoyed a few of the sessions. Nice work.