10h ago

DashAttention Advances Adaptive Sparse Hierarchical Attention In LLMs

0
Original post

Goodbye top-k in hierarchical attention! We devised DashAttention, which is adaptively sparse (compute is allocated based on the information structure of the query) and end-to-end differentiable. DashAttention pushes the accuracy–efficieny frontier over NSA and InfLLMv2!

8:52 AM · May 21, 2026 View on X