Huge congrats to the Microsoft AI team on MAI-Thinking-1.
Great to see large-scale RL systems converging around the SGLang + Ray ecosystem. Rocket’s design—async RL, separated rollout / inference / learner pools, router-based traffic control, prefix caching, and fault-tolerant inference—is very aligned with what we believe in slime: RL is not just an algorithm problem, but a full-stack infrastructure problem.
Excited to see more open RL infra ideas validated at frontier scale!
Huge milestone for the Microsoft AI team: seven frontier MAI models, led by MAI-Thinking-1. Proud that SGLang powered the RL inference stack behind it. Their Rocket framework runs SGLang and the SGLang router for load balancing, traffic control, prefix caching, and graceful failure recovery across thousands of inference chips.
Congrats to the team @MicrosoftAI 👏
Read more on how SGLang powers the stack: https://microsoft.ai/wp-content/uploads/2026/06/main_20260602_2.pdf