
LMSYS details agent-assisted development for SGLang, reporting a 71.4 percent throughput gain for Qwen3-Next on the LLM inference engine

The workflow also delivers speedups of up to 2.75x on B200 kernel tasks


Original post

LMSYS Org (@lmsysorg)

🚀 New blog: Agent-Assisted SGLang Development, the story of how we turn benchmarking, profiling, and kernel optimization know-how into executable agent skills.

Agent-assisted workflows are saving our team massive engineering hours while delivering major gains across the stack:

⚡️ +71.4% throughput & TTFT 456→168ms for Qwen3-Next via allreduce fusion
⚡️ 29–49% TTFT reduction on long-context prompts via router tokenization deduplication
⚡️ Up to 2.32x diffusion denoising speedup via Spectral Progressive Diffusion
⚡️ 10 B200 kernel tasks at 1.13x–2.75x speedups via KDA-Pilot; 3 PRs merged upstream
⚡️ 1.41x faster LTX-2 VAE decode, saving 9.7 GiB peak memory

And rigor is built into every step: benchmarks are fixed before any patching, baseline and candidate share the same ABI, and every change must be backed by profile evidence, eliminating benchmark reward hacking. Each iteration passes a Humanize/RLCR review loop before proceeding.

Read the full blog to see how we're rethinking development workflow 👇

10:01 AM · Jul 2, 2026 · 4.9K Views
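
The post mixes three kinds of figures: percentage gains, before/after latencies, and "Nx" speedup factors. As a rough aid for comparing them, the sketch below converts between these forms using only numbers quoted in the post. The helper names (`percent_reduction`, `speedup_factor`, `time_saved_percent`) are illustrative assumptions, not part of SGLang or the LMSYS blog.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which a latency dropped going from `before` to `after`."""
    return (before - after) / before * 100.0


def speedup_factor(before: float, after: float) -> float:
    """How many times faster `after` is than `before` (the "Nx" form)."""
    return before / after


def time_saved_percent(speedup: float) -> float:
    """Percentage of run time removed by a given "Nx" speedup."""
    return (1.0 - 1.0 / speedup) * 100.0


# TTFT for Qwen3-Next as quoted in the post: 456 ms -> 168 ms.
ttft_before_ms, ttft_after_ms = 456.0, 168.0
print(round(percent_reduction(ttft_before_ms, ttft_after_ms), 1))  # 63.2
print(round(speedup_factor(ttft_before_ms, ttft_after_ms), 2))     # 2.71

# The quoted B200 kernel range, 1.13x to 2.75x, expressed as time saved.
print(round(time_saved_percent(1.13), 1))  # 11.5
print(round(time_saved_percent(2.75), 1))  # 63.6
```

Read this way, the 456→168 ms TTFT change is roughly a 63 percent reduction, and a +71.4% throughput gain is the same as a 1.714x throughput factor.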


Posts from X


Ying Sheng (@ying11231)

On the way to automating open source project maintenance.


LMSYS Org (@lmsysorg)

Read full blog: https://www.lmsys.org/blog/2026-07-02-agent-assisted-sglang-development
