Latent Context Language Models compress context tokens up to 16x, cutting time-to-first-token by 8.8x on the RULER benchmark · Digg