Well, the idea seems cool. This uses a coreset idea from approximation theory to "express" a strong non-causal attention approximation as a causal, streaming one. https://arxiv.org/abs/2606.10944
5:46 AM · Jun 10, 2026
It includes an efficient Triton implementation for practical deployment.