MiniMax details its M3 sparse attention architecture, claiming a 15.6x decoding speedup at 1 million tokens · Digg