Google DeepMind's Mathieu Blondel argues autoregressive models outperform diffusion for discrete sequences due to token-wise softmax independence limitations · Digg