OpenBMB Ships 9B MiniCPM-o 4.5 For Full-Duplex Real-Time AI

Just a few days ago, Thinking Machines Lab (TML) showcased a way of making AI interaction continuous instead of turn-based: full-duplex, time-aligned micro-turns. It is a preview of near-real-time AI voice and video conversation built on new 'interaction models'.

MiniCPM-o 4.5 has already shipped the same core idea through OpenBMB’s Omni-Flow framework: time-aligned perception and response instead of old turn-based chat. It is a 9B full-duplex omni-modal model that can see, hear, and speak at the same time.

Omni-Flow treats interaction as a continuous stream on a shared temporal axis, aligning visual input, audio input, and output speech/text into time chunks so the model can perceive while responding. That breaks the old walkie-talkie UX of AI: user talks, model waits, model replies.

And this is not just a demo concept. It is a 9B open model with code, weights, a report, and edge deployment under 12GB RAM. It also surpasses Qwen3-Omni-30B-A3B in omni-modal capabilities and speech generation quality.

This feels like the interaction layer AI was missing: a real full-duplex omni-modal architecture, with video tokens, audio tokens, LLM hidden states, speech tokens, and waveform generation all synced to one shared timeline.
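
To make the "time chunks on one shared timeline" idea concrete, here is a toy Python sketch. It is not OpenBMB's code or the Omni-Flow API: the `Event` type, the `align_to_chunks` helper, the stream names, and the 1-second chunk length are all made up for illustration. It only shows how input and output streams can be bucketed by the same clock, so that model speech and incoming user audio land in the same chunk instead of alternating turns.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumed chunk length; the post does not state the real value.
CHUNK_SECONDS = 1.0


@dataclass(frozen=True)
class Event:
    """A hypothetical timestamped item from one stream."""
    stream: str       # e.g. "video", "audio_in", "speech_out"
    timestamp: float  # seconds on the shared timeline
    payload: str


def align_to_chunks(events, chunk_seconds=CHUNK_SECONDS):
    """Group events from all streams by the time chunk they fall in."""
    chunks = defaultdict(lambda: defaultdict(list))
    for ev in sorted(events, key=lambda e: e.timestamp):
        index = int(ev.timestamp // chunk_seconds)
        chunks[index][ev.stream].append(ev.payload)
    # Convert nested defaultdicts to plain dicts, ordered by chunk index.
    return {i: dict(streams) for i, streams in sorted(chunks.items())}


events = [
    Event("video", 0.2, "frame_0"),
    Event("audio_in", 0.4, "user_audio_0"),
    Event("speech_out", 0.7, "model_speech_0"),
    Event("audio_in", 1.1, "user_audio_1"),
    Event("speech_out", 1.3, "model_speech_1"),
]

timeline = align_to_chunks(events)
for index, streams in timeline.items():
    print(index, streams)
```

In this toy timeline, every chunk holds both `audio_in` and `speech_out`, which is the opposite of the walkie-talkie pattern where a chunk would contain only one of the two.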

11:28 AM · May 17, 2026