Microsoft Research introduces Next-Latent Prediction, training transformers on internal latent states to achieve up to 3.3x faster inference · Digg