1d ago
Poolside releases technical report on Laguna models, detailing expert collapse and its REINFORCE-style RL pipeline
Ethan Smith suggests integrating variance reduction methods like REBAR.