there's like 3 or 4 distinct scaling axes in RL that nobody talks about for some reason
Prime Intellect's Will Brown argues three or four scaling axes in reinforcement learning are currently overlooked
Founder Danielle Fong suggests the actual number is even higher
i'll give one meta-framing which can be unpacked into several
we often want to simultaneously optimize two competing objectives, e.g. performance and reasoning efficiency, or task volume and generalization-per-task
these can all be minmax problems with lagrangian formulations
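to make the lagrangian framing concrete, here's a minimal sketch, not anything from the thread itself. everything in it is an assumption for illustration: a scalar `theta` stands in for something like reasoning length, the toy performance curve `-(theta - 4)^2` and cost `theta` are made up, and `solve_tradeoff` is a hypothetical name. it does gradient ascent on `theta` and projected gradient steps on the multiplier `lam`, i.e. the min-max over L(theta, lam) = perf(theta) - lam * (cost(theta) - budget).

```python
def solve_tradeoff(budget=2.0, steps=5000, lr=0.05):
    """Primal-dual gradient method on a toy constrained problem (illustrative only).

    maximize    perf(theta) = -(theta - 4)^2      # assumed toy performance curve
    subject to  cost(theta) = theta <= budget     # assumed toy cost, e.g. tokens used
    """
    theta, lam = 0.0, 0.0
    for _ in range(steps):
        # Primal step: ascend L(theta, lam) = perf(theta) - lam * (cost(theta) - budget)
        grad_theta = -2.0 * (theta - 4.0) - lam
        theta += lr * grad_theta
        # Dual step: raise lam while over budget, lower it otherwise, keep lam >= 0
        lam = max(0.0, lam + lr * (theta - budget))
    return theta, lam


if __name__ == "__main__":
    theta, lam = solve_tradeoff(budget=2.0)
    print(f"theta={theta:.3f}, lam={lam:.3f}")
```

in this toy setup a budget of 2 makes the constraint bind, so the iterates settle near theta = 2 with lam = 4, and the multiplier reads as the marginal performance given up per unit of budget. with a loose budget (anything above 4 here) lam falls back to 0 and the problem reduces to plain performance maximization.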
@willccbb maybe more
@willccbb Not exactly the same (i think?) but i've been thinking about the predictability vs diversity tension for test time scaling, lately.
@willccbb Chat which