Junli Wang releases NanoRollout for agent rollouts
Junli Wang introduced NanoRollout, a lightweight open infrastructure with about 900 lines of core code for running digital agent rollouts at scale. It targets the slowdown caused by heavy environments so that rollouts can run in parallel across thousands of instances. It was validated on three workloads: reinforcement learning for SWE agents at a batch size of 4K, which surpassed DeepSWE-32B; distillation of more than 250,000 coding trajectories into a state-of-the-art open coding agent at or below 32B parameters; and fast evaluation of coding, computer-use, and unified agents. The project blog contains the code and details.
The lack of lightweight, open agent infra has been a massive pain point. This is a great starting point, esp. for large-scale (thousands of parallel envs), battle-tested coding / computer-use agent training!
Digital agent learning needs massive rollouts. But digital agent rollouts are painfully slow due to heavy environments. 🐌
🚀 We introduce NanoRollout, a lightweight open infra (900 lines core code) for digital agent rollout at scale, validated with three workloads:
🏋️ Large batchsize (4K) SWE Agent RL -> surpasses DeepSWE-32B
🧪 250k+ distilled coding trajectories -> SOTA ≤32B open coding agent
⚡ Fast evaluation on coding/cua/unified agent -> finish
Check our Blog: https://cocoa-org.notion.site/nanorollout
Check out our new work, NanoRollout!
Very solid agentic infra work on accelerating agent rollout!