SGLang, led by RadixArk's Banghua Zhu, adds Day-0 support for Poolside's 225B Laguna M.1 model

The 256-expert Mixture-of-Experts model targets long-horizon agentic coding.

Original post

LMSYS Org@lmsysorg

🎉 Day-0 support for Laguna M.1 from @poolsideai is live in SGLang! This is a 225B MoE built for agentic coding & long-horizon work.

1️⃣ 70-layer MoE: 3 dense SwiGLU layers + 67 sparse MoE layers, 256 experts, top-k=16 with aux-loss-free load balancing
2️⃣ Global attention across all layers: 64 Q-heads, 8 KV-heads, softplus output gating
3️⃣ Native interleaved reasoning: thinking between tool calls, toggleable per-request
4️⃣ Strong on SWE-bench Verified, SWE-bench Multilingual, SWE-Bench Pro & Terminal-Bench 2.0.

Run it now with SGLang!
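For readers unfamiliar with the routing terms in the post, here is a minimal sketch of what top-k=16 routing over 256 experts can look like. The post does not say how Laguna M.1 implements its aux-loss-free load balancing, so the per-expert bias scheme below is an assumption borrowed from one commonly described approach, and the names `route` and `update_bias` are hypothetical. Only the counts (256 experts, top-k 16, 64 Q-heads, 8 KV-heads) come from the post.

```python
import random

NUM_EXPERTS = 256           # from the post
TOP_K = 16                  # from the post
Q_HEADS, KV_HEADS = 64, 8   # from the post


def route(scores, bias, k=TOP_K):
    """Select k experts by biased score, then weight them by the raw scores.

    Assumption: the bias only influences which experts are selected,
    not how their outputs are mixed.
    """
    ranked = sorted(range(len(scores)),
                    key=lambda e: scores[e] + bias[e],
                    reverse=True)
    chosen = ranked[:k]
    total = sum(scores[e] for e in chosen)
    return {e: scores[e] / total for e in chosen}


def update_bias(bias, load, step=0.001):
    """Raise the bias of under-loaded experts and lower it for over-loaded ones.

    Assumed update rule; no auxiliary loss term is involved.
    """
    mean = sum(load) / len(load)
    new_bias = []
    for b, l in zip(bias, load):
        if l < mean:
            new_bias.append(b + step)
        elif l > mean:
            new_bias.append(b - step)
        else:
            new_bias.append(b)
    return new_bias


# Route 200 toy tokens with random router scores, then adjust the biases once.
rng = random.Random(0)
bias = [0.0] * NUM_EXPERTS
load = [0] * NUM_EXPERTS
for _ in range(200):
    scores = [rng.random() for _ in range(NUM_EXPERTS)]
    for expert in route(scores, bias):
        load[expert] += 1
bias = update_bias(bias, load)

active_fraction = TOP_K / NUM_EXPERTS        # 0.0625: 1 in 16 experts runs per token
queries_per_kv_head = Q_HEADS // KV_HEADS    # 8 query heads share each KV head
```

With these counts, each token activates 6.25% of the experts in a sparse layer, and each key/value head serves 8 query heads, which is the usual motivation for grouped-query attention: a smaller KV cache at long context lengths.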

Poolside@poolsideai

Today we’re releasing the weights for Laguna M.1, our most capable model to date, with a 256K context length. Both base and post-trained checkpoints are now available on Hugging Face under Apache 2.0.