AI · 10h ago

vLLM-Omni v0.22.0 Adds NVIDIA Cosmos 3 Support and Robot Serving

Original post

Ming-Yu Liu#1452

vLLM@vllm_project

🎉 Meet vLLM-Omni v0.22.0, a major upgrade for omnimodal world models and production-grade multimodal serving.

🌍 Day-0 @NVIDIAAI Cosmos 3 world models: text, image, audio, video, and action, in and out.
🤖 Robot serving: DreamZero + OpenPI realtime API.
🎙️ Production TTS: Qwen3-TTS, Qwen3-Omni, VoxCPM2 and more.
🎨 Faster image/video/diffusion: Wan 2.2, HunyuanVideo 1.5, LTX-2.3.
⚡ Broader quantization (FP8/INT8, MXFP4/MXFP8, W4A16, ModelOpt) and hardware coverage.

339 commits, 124 contributors, 52 of them new. Thank you all. 🙌

🔗 https://github.com/vllm-project/vllm-omni/releases/tag/v0.22.0

8:55 AM · Jun 8, 2026 · 20.9K Views

Sentiment

Users praise vLLM-Omni v0.22.0 for its robotics serving support (DreamZero plus the OpenPI realtime API) and for the speed of its FP8-quantized serving.

Positive: 100.0%

Negative: 0.0%

2 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

阿納斯塔西婭@3___infinix

@vllm_project @NVIDIAAI vLLM keeps pushing harder on multimodal, and the Day-0 Cosmos 3 support is impressive. How does the latency hold up in practice?

9h143

Prithvi Jadwani | AI SEO | GEO | REDDIT SEO | GMB@Prithvi_Jadwani

@vllm_project @NVIDIAAI 339 commits is the new funding round. What's the growth rate in actual users?

8h113

Oleksandr@dadadaistt

@vllm_project @NVIDIAAI omni serving insane fast fp8 quantization lit

5h50

learnbydoingwithsteven数能生智@Catchingtides

@vllm_project @NVIDIAAI Production-grade multimodal serving is becoming a distinct infrastructure layer from text-only inference.

5h33

Oracles Technologies LLC@OraclesTech

@vllm_project @NVIDIAAI Keep your LLMs & agents secure! https://github.com/OraclesTech/guardian-sdk

8h7

cc@ccb8128

@vllm_project @NVIDIAAI The robotics serving support (DreamZero + OpenPI realtime API) is the sleeper feature here. Everyone's focused on chat/vision, but real-time robot serving at production grade is where multimodal actually gets interesting. 52 new contributors in one release is wild too.