MiniMax Contributes Parallel Models To Gradbot For Instant Voice Responses

Original post

Reasoning LLMs typically take 2-3 seconds to start emitting tokens. In a voice agent, that's 2-3 seconds of silence after the user finishes speaking.

The @MiniMax_AI team just shipped a community contribution to Gradbot that runs two models in parallel. MiniMax-M2-her produces a short acknowledgement that starts streaming to TTS immediately, while MiniMax-M2.7 handles reasoning and tool calls in the background.
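
To make the pattern concrete, here is a minimal sketch of the idea using Python's asyncio. It is not Gradbot's actual code or API: `fast_ack`, `slow_reasoner`, `respond`, and the `speak` callback are hypothetical stand-ins, and the sleeps only simulate model latency.

```python
import asyncio


async def fast_ack(user_text: str):
    """Hypothetical stand-in for the low-latency model: streams a short acknowledgement."""
    for chunk in ["Sure, ", "let me check."]:
        await asyncio.sleep(0.01)  # simulated per-chunk latency
        yield chunk


async def slow_reasoner(user_text: str) -> str:
    """Hypothetical stand-in for the reasoning model, including any tool calls."""
    await asyncio.sleep(0.2)  # simulated reasoning delay
    return f"Here is the full answer to: {user_text}"


async def respond(user_text: str, speak) -> None:
    """Send the acknowledgement to `speak` right away, then the reasoned answer."""
    # Start the slow model first so it works while the acknowledgement streams.
    reasoning = asyncio.create_task(slow_reasoner(user_text))
    async for chunk in fast_ack(user_text):
        speak(chunk)  # in a real agent this would feed a TTS stream
    speak(await reasoning)


def run_demo(user_text: str) -> list:
    """Collect everything that would have been spoken, in order."""
    spoken = []
    asyncio.run(respond(user_text, spoken.append))
    return spoken


if __name__ == "__main__":
    print(run_demo("What's the weather?"))
```

The key design choice in this sketch is creating the reasoning task before streaming the acknowledgement, so the slow model's delay overlaps with speech instead of preceding it.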

Thanks to @davidtaoweiji for this contribution. Check out our README for more details: https://github.com/gradium-ai/gradbot

6:31 AM · Jun 5, 2026 · 4.1K Views