AI · 1d ago

CMU's Beidi Chen and Infini-AI Lab release AstraFlow v0.1.1, decoupling RL rollout from training with recursive agents

The release adds a dynamic recursive agents RL recipe and a Megatron training backend, alongside the existing FSDP backend and rollout support via SGLang.

Original post

Haizhong Zheng@haizhong_zheng

Happy to see more RL systems moving toward this deployment shape.

This has been one of the core ideas behind AstraFlow since our early design: large-scale agentic RL should move beyond trainer-centered “engine mode” and toward independently managed rollout/inference and training systems, connected by a clean rollout + weight-sync contract.

In AstraFlow (https://github.com/Infini-AI-Lab/astraflow), we have been building toward this direction through rollout/trainer service decoupling, bring-your-own rollout service, flexible dataflow, and support for heterogeneous rollout backends.

Excited to see the broader community converging on this architecture. I believe this is where large-scale agentic RL infrastructure is heading.

8:20 AM · Jun 5, 2026 · 7.7K Views
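
The "rollout + weight-sync contract" described in the post can be pictured as two small interfaces that only exchange trajectories and versioned weights. The following is a minimal Python sketch under that reading, not AstraFlow's actual API: `Trajectory`, `RolloutService`, `ToyRollout`, `ToyTrainer`, `sync_weights`, and `train_loop` are all hypothetical names invented for illustration, and the "model" is a single float.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Trajectory:
    prompt: str
    response: str
    reward: float
    weight_version: int  # policy version that produced this sample


class RolloutService(Protocol):
    """Hypothetical contract: the trainer only needs these two calls."""

    def generate(self, prompts: list[str]) -> list[Trajectory]: ...

    def sync_weights(self, version: int, weights: dict[str, float]) -> None: ...


class ToyRollout:
    """Stand-in for an independently managed inference service."""

    def __init__(self) -> None:
        self.version = 0
        self.weights: dict[str, float] = {"bias": 0.0}

    def generate(self, prompts: list[str]) -> list[Trajectory]:
        # Toy "policy": uppercase the prompt; reward depends on current weights.
        return [
            Trajectory(p, p.upper(), len(p) + self.weights["bias"], self.version)
            for p in prompts
        ]

    def sync_weights(self, version: int, weights: dict[str, float]) -> None:
        self.version = version
        self.weights = dict(weights)


class ToyTrainer:
    """Stand-in for a training backend that never calls the model directly."""

    def __init__(self) -> None:
        self.version = 0
        self.weights: dict[str, float] = {"bias": 0.0}

    def step(self, batch: list[Trajectory]) -> tuple[int, dict[str, float]]:
        mean_reward = sum(t.reward for t in batch) / len(batch)
        self.weights["bias"] += 0.1 * mean_reward  # placeholder "update"
        self.version += 1
        return self.version, dict(self.weights)


def train_loop(rollout: RolloutService, trainer: ToyTrainer,
               prompts: list[str], steps: int) -> int:
    for _ in range(steps):
        batch = rollout.generate(prompts)       # rollout side
        version, weights = trainer.step(batch)  # training side
        rollout.sync_weights(version, weights)  # weight-sync side
    return trainer.version
```

Because the loop depends only on the `RolloutService` protocol, any backend that implements those two methods could be swapped in, which is the "bring-your-own rollout service" idea in miniature.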

Cluster Engagement

Posts from X

Most Activity

Views: 6.5K · Bookmarks: 14 · Likes: 32 · Retweets: 6

Infini-AI-Lab@InfiniAILab

🚀 AstraFlow v0.1.1 is out!

New in this release:
• Dynamic recursive agents RL recipe
• Megatron training backend

Inspired by dynamic agent workflows like @claudeai, this release reproduces the recursive agent RL recipe, where models learn to automatically spawn sub-agents to solve subtasks. (The implementation is based on the awesome Recursive Agent Optimization paper by @apurvasgandhi, @gneubig, @aviral_kumar2: https://arxiv.org/abs/2605.06639)

With existing RaaS support via @sgl_project and training backends including FSDP and Megatron, AstraFlow is moving toward a more flexible stack for large-scale agentic RL.

⭐ Repo: https://github.com/Infini-AI-Lab/astraflow 📖 Dynamic agents recipe: https://infini-ai-lab.github.io/astraflow/docs/recipes/textcraft-recursive.html
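
As a rough picture of what "spawning sub-agents to solve subtasks" means structurally, here is a toy Python sketch. It is not the AstraFlow recipe or the method from the cited paper: the spawn rule is hard-coded (split whenever the task exceeds a context limit) rather than learned, and `solve`, `RunStats`, and `context_limit` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    agents_spawned: int = 0
    max_depth: int = 0


def solve(items: list[int], context_limit: int, stats: RunStats,
          depth: int = 0) -> int:
    """Sum `items`, delegating halves to sub-agents when the task is too big."""
    if context_limit < 1:
        raise ValueError("context_limit must be at least 1")
    stats.max_depth = max(stats.max_depth, depth)
    if len(items) <= context_limit:
        return sum(items)  # task fits: solve directly, no spawn
    mid = len(items) // 2
    stats.agents_spawned += 2  # one sub-agent per half
    left = solve(items[:mid], context_limit, stats, depth + 1)
    right = solve(items[mid:], context_limit, stats, depth + 1)
    return left + right
```

In an RL setting the fixed `len(items) <= context_limit` test would be replaced by a decision the model learns to make.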

23h · 6.5K views · 32 likes · 14 bookmarks

Replies: 2

Beidi Chen@BeidiChen

Come and check out AstraFlow's new release~ While recursive agents are becoming one of the most exciting directions in agentic AI, we include this RL recipe in AstraFlow v0.1.1 showing how powerful dynamic sub-agent workflows can be in real-world coding tasks.

23h5.5K248

Matt@m13v_

@BeidiChen recursive sub-agent workflows are a real direction, not a demo trick. the seam is persistence: dynamic orchestration shines on a fresh run but loses the thread the moment a long session auto-compacts or the process restarts mid-task. https://fazm.ai/r/c5fp8ygt written with ai

21h20

AiDevCraft@AiDevCraft

@BeidiChen The interesting bit is whether the RL signal rewards the parent for *not* spawning when the context fits — most dynamic agent setups I've seen overspawn because the reward shape rewards latency/quality but ignores sub-agent token cost.

18h9
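
The overspawning concern in this reply can be made concrete with a toy reward. The sketch below is hypothetical and is not taken from AstraFlow or the Recursive Agent Optimization paper; `shaped_reward` and the `token_cost` coefficient are assumptions chosen for illustration.

```python
def shaped_reward(task_score: float, subagent_tokens: int,
                  token_cost: float = 1e-4) -> float:
    """Task score minus a linear penalty on tokens spent by sub-agents."""
    return task_score - token_cost * subagent_tokens


# Two rollouts reach the same task score; only one spawned sub-agents.
no_spawn = shaped_reward(task_score=1.0, subagent_tokens=0)
with_spawn = shaped_reward(task_score=1.0, subagent_tokens=5000)
```

With `token_cost = 0` the two rollouts are indistinguishable, so nothing discourages spawning; with a positive cost the parent is rewarded for solving the task directly when the context fits.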