
NanoRollout is a lightweight open-source rollout-as-a-service framework with 900 lines of core code, used for 4K-batch RL on SWE agents and to generate over 250,000 coding trajectories

AI Judge changed title after evaluation, original title: "NanoRollout introduces a lightweight open-source framework with 900 lines of core code to accelerate digital agent rollouts via rollout-as-a-service and Miles integration"

The system integrates directly with Miles for scalable agent RL training.


Original post

Banghua Zhu#1718

RadixArk@radixark

Slow, heavy environments have been the real bottleneck for agentic RL. NanoRollout tackles it head-on with a clean rollout-as-a-service design, integrated with miles for scalable agent RL. Great work from the team!

Junli Wang@JunliWang2021

Digital agent learning needs massive rollouts. But digital agent rollouts are painfully slow due to heavy environments. 🐌

🚀 We introduce NanoRollout, a lightweight open infra (900 lines core code) for digital agent rollout at scale, validated with three workloads:

🏋️ Large batch size (4K) SWE Agent RL -> surpasses DeepSWE-32B
🧪 250k+ distilled coding trajectories -> SOTA ≤32B open coding agent
⚡ Fast evaluation on coding/cua/unified agent -> finish

Check our Blog: https://cocoa-org.notion.site/nanorollout

8:44 PM · May 18, 2026 · 13.1K Views

Sentiment

Users express gratitude to collaborators for making NanoRollout's lightweight open infrastructure for scalable agent rollouts possible.

Pos: 100.0%

Neg: 0.0%

1 comment with sentiment.


Related links

nanorollout (cocoa-org.notion.site)

Posts from X


Hao Zhang@haozhangml

🎉🎉🥳🥳


Junli Wang@JunliWang2021

[6/N]

Finally, huge thanks to the amazing collaborators who made NanoRollout possible. Co-led with @ChengZhoujun. Grateful to @Yuxuan_Zhang13, @Ber18791531, @tyao923 for awesome collaboration.

And special thanks to @haozhangml and @rajammanabrolu for the support all the time.

📝 Blog: https://cocoa-org.notion.site/nanorollout
💻 Code: https://github.com/cocoa-org/NanoRollout
🤗 Mocha collection: https://huggingface.co/collections/cocoa-org/nanorollout


Junli Wang@JunliWang2021

[4/N] NanoRollout is a factory for trajectory distillation

We built 250K+ coding-agent trajectories across teacher models and harnesses.

📈 The scaling curve is clean: more high-quality trajectory tokens -> stronger coding agents.

Our <=32B model reaches SOTA performance on SWE-Bench Verified and is on par with Qwen3-Coder-480B-A35B!


Junli Wang@JunliWang2021

[2/N] NanoRollout turns agent rollout into a service

It cleanly decouples three layers behind one unified rollout server:

🤖 agent harnesses
🌍 environment runtimes
⚙️ execution backends

The interface is simple:

/run -> trajectory

Give it tasks + a model endpoint. Get back trajectories, metadata, status, and rewards. RL, distillation, and evaluation pipelines all share the same rollout layer.
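To make the shape of that interface concrete, here is a minimal Python sketch of the data a client might send to and read back from a `/run`-style endpoint. The thread only says that tasks and a model endpoint go in and that trajectories, metadata, status, and rewards come out. Every field name below (`tasks`, `model_endpoint`, `trajectory`, `reward`, and so on) and the JSON layout are assumptions for illustration, not NanoRollout's actual schema.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class RolloutRequest:
    # Hypothetical request shape: the thread only says "tasks + a model endpoint".
    tasks: list[str]
    model_endpoint: str

    def to_json(self) -> str:
        """Serialize the request body that would be posted to /run."""
        return json.dumps({"tasks": self.tasks, "model_endpoint": self.model_endpoint})


@dataclass
class RolloutResult:
    # Hypothetical response shape covering the four outputs named in the thread.
    trajectory: list[dict[str, Any]]
    metadata: dict[str, Any]
    status: str
    reward: float


def parse_run_response(body: str) -> list[RolloutResult]:
    """Decode a JSON list of per-task results (assumed layout)."""
    return [RolloutResult(**item) for item in json.loads(body)]
```

Because RL, distillation, and evaluation would all consume the same result type in this sketch, a trainer could read `reward`, a distillation job could keep `trajectory`, and an eval script could aggregate `status`, all without separate rollout code.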


Junli Wang@JunliWang2021

[3/N] Why does this matter for agentic RL?

NanoRollout plugs into existing trainers with minimal rewrites.

🔌 We integrated it with miles, veRL, and tunix out of the box

🏆 By moving agent execution off the trainer side, we scale rollout batches to 4K and outperform strong baselines ProRL-Agent and DeepSWE.
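One way to picture moving agent execution off the trainer side: the trainer only dispatches a large batch of rollout requests and waits for results, with a cap on how many are in flight. The Python sketch below is an assumption-laden illustration, not NanoRollout's code. `fake_rollout` stands in for a call to a remote rollout service, and reading "4K" as 4096 with a limit of 256 concurrent requests is a guess made for the example.

```python
import asyncio


async def fake_rollout(task_id: int) -> dict:
    # Stand-in for one request to a remote rollout service (hypothetical).
    await asyncio.sleep(0)  # yield control, as a network call would
    return {"task_id": task_id, "status": "done", "reward": float(task_id % 2)}


async def collect_batch(task_ids: list[int], max_in_flight: int) -> list[dict]:
    # The semaphore caps how many rollouts are outstanding at once.
    gate = asyncio.Semaphore(max_in_flight)

    async def one(task_id: int) -> dict:
        async with gate:
            return await fake_rollout(task_id)

    # gather preserves input order, so results line up with task_ids.
    return await asyncio.gather(*(one(t) for t in task_ids))


def run_batch(batch_size: int = 4096, max_in_flight: int = 256) -> list[dict]:
    # batch_size=4096 is an assumed reading of "4K" from the thread.
    return asyncio.run(collect_batch(list(range(batch_size)), max_in_flight))
```

In this arrangement the trainer process holds no environment state. It only awaits results, so batch size is limited by the rollout service's capacity and not by trainer-side resources.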


Junli Wang@JunliWang2021

[5/N] NanoRollout enables fast agent evaluation

NanoRollout pushes benchmark eval until the bottleneck moves off environment orchestration entirely.

⚡ Across SWE-Bench Verified, Terminal-Bench, CocoaBench, and OSWorld, scaling environment parallelism delivers massive wall-clock speedups.

Concrete example, 500 parallel SWE-Bench environments: 🐢 102 min -> 🐇 18 min (5.7× faster)
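The quoted speedup follows directly from the two wall-clock times. A short Python check, using only the numbers from the post (the helper name `speedup` is ours):

```python
def speedup(before_min: float, after_min: float) -> float:
    """Ratio of the old wall-clock time to the new one."""
    return before_min / after_min


ratio = speedup(102, 18)   # 102 / 18 = 5.666...
saved = 102 - 18           # minutes saved per evaluation run
print(f"{ratio:.1f}x faster, {saved} min saved")  # prints "5.7x faster, 84 min saved"
```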


Nick@Reddot_18

@radixark How do I use it?
