BLASST wins Best Paper at MLSys 2026 for a drop-in, training-free dynamic sparse attention mechanism that thresholds online softmax statistics to skip negligible blocks in long-context LLM inference.
It targets self-attention compute and memory bottlenecks during inference.
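To make the idea concrete, here is a minimal sketch of block skipping driven by online softmax statistics. It is an illustration under stated assumptions, not the paper's algorithm or kernel: the function name `blockwise_attention`, the `threshold` parameter, and the specific skip rule (compare a block's maximum score against the running maximum) are all hypothetical choices made for this example.

```python
import math

def blockwise_attention(q, keys, values, block_size=2, threshold=None):
    """Single-query attention computed block by block with online softmax.

    Assumed skip rule (hypothetical, for illustration only): a block is
    skipped when exp(block_max - running_max) < threshold, i.e. when its
    largest softmax weight would be negligible relative to what has
    already been accumulated. Returns (output_vector, skipped_blocks).
    """
    scale = 1.0 / math.sqrt(len(q))
    running_max = float("-inf")   # running maximum of all scores seen
    denom = 0.0                   # running softmax denominator
    acc = [0.0] * len(values[0])  # running weighted sum of values
    skipped = 0

    for start in range(0, len(keys), block_size):
        k_blk = keys[start:start + block_size]
        v_blk = values[start:start + block_size]
        # Scores are still computed here; what the skip avoids in this
        # sketch is the exponentials and the value accumulation.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in k_blk]
        block_max = max(scores)

        # Compare in log space to avoid overflow in exp().
        if (threshold is not None
                and running_max != float("-inf")
                and block_max - running_max < math.log(threshold)):
            skipped += 1
            continue

        # Standard online softmax update: rescale old statistics to the
        # new maximum, then fold in the current block.
        new_max = max(running_max, block_max)
        rescale = 0.0 if running_max == float("-inf") else math.exp(running_max - new_max)
        denom *= rescale
        acc = [a * rescale for a in acc]
        for s, v in zip(scores, v_blk):
            w = math.exp(s - new_max)
            denom += w
            acc = [a + w * vi for a, vi in zip(acc, v)]
        running_max = new_max

    return [a / denom for a in acc], skipped
```

With `threshold=None` the function reduces to ordinary dense attention, which makes it easy to compare the two outputs and check how much the skipped blocks would have contributed.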
QUOTE POST
finbarr (@FINBARRTIMBERS)
This is an elegant paper; hope to try it out soon.
1:42 PM · May 18, 2026 · 15.8K Views