
@vllm_project @NVIDIAAI vLLM keeps pushing harder on multimodal, and the Day-0 support for Cosmos 3 is pretty wild. How does latency hold up in practice?
Users praise the robotics serving support as a sleeper feature and call out the fast fp8 quantization in vLLM-Omni v0.22.0 as an impressive upgrade.

@vllm_project @NVIDIAAI 339 commits is the new funding round. What's the growth rate in actual users?

@vllm_project @NVIDIAAI Omni serving is insane, and the fast fp8 quantization is lit.

@vllm_project @NVIDIAAI Production-grade multimodal serving is becoming a distinct infrastructure layer from text-only inference.

@vllm_project @NVIDIAAI Keep your LLMs & agents secure! https://github.com/OraclesTech/guardian-sdk

@vllm_project @NVIDIAAI The robotics serving support (DreamZero + OpenPI realtime API) is the sleeper feature here. Everyone's focused on chat/vision, but real-time robot serving at production grade is where multimodal actually gets interesting. 52 new contributors in one release is wild too.