NVIDIA Researchers Release VoLo Orchestrator for Long-Horizon Robot Manipulation

Original post

Chris Paxton #737

Siyi Chen @ChenSiyich

Wonderful to be back from #CVPR2026, and excited to share the release of our follow-up work:

VoLo: A Physical Orchestrator for Open-Vocabulary Long-Horizon Manipulation

VoLo introduces the idea of a physical orchestrator for open-vocabulary, long-horizon manipulation. Our goal is to move toward robots that can reason, plan, act, monitor, and recover by adaptively using VLA/WAMs, vision models, and action primitives as tools.

We introduce three main contributions:

🤖 VoLoAgent — a physical orchestrator that plans, monitors, and recovers by adaptively using, halting, and redirecting robot actions with tools.

📊 RoboVoLo — a high-fidelity benchmark with 126 open-vocabulary long-horizon manipulation tasks spanning common sense, memory/state tracking, complex references, and world knowledge.

📈 A large-scale empirical study comparing action models, code-as-policy systems, TAMP-style systems, and ablations of the VoLoAgent orchestrator, complemented by real-robot experiments.

This work was done during my internship at @NVIDIA and would not have been possible without my brilliant collaborators: Hugo Hadfield, Alexander Zook, @mikacuy, @luke_ch_song, @erwincoumans, @xuningy, Faisal Ladhak, @qu_1006, @BirchfieldStan, Jonathan Tremblay, and @robovalts. Huge thanks to everyone!

🔗 Project: https://chicychen.github.io/VoLo/
🔗 Previous work, SpaceTools: https://spacetools.github.io/

#Robotics #EmbodiedAI #VisionLanguageModels #VLAModels #RobotLearning #NVIDIA #CVPR2026 #LongHorizonManipulation #AI #ComputerVision

5:44 PM · Jun 9, 2026 · 2.4K Views

Sentiment

The one comment with detected sentiment is positive: it praises the VoLo Orchestrator release for open-vocabulary robot manipulation, calling it great work.

Positive: 100.0%

Negative: 0.0%

1 comment with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views: 397 · Likes: 1

Jiafei Duan @DJiafei

@ChenSiyich Great work! Really like it

4h ago · 397 views · 1 like
