completely agree with this take: the hardest part of agentic RL is managing and scaling the environments, not the algorithm (which is also why I find it super interesting to work on, since it touches so many areas at once)
got both the agentic RL and eval guides on the reading list 🤓
> https://cameronrwolfe.substack.com/p/agentic-rl
> https://cameronrwolfe.substack.com/p/agent-evals
One of the hardest aspects of agentic RL is managing / scaling environments...
🧵 [1/6]