Users thank authors and praise Microsoft's Mirage for storing 3D scenes as latent tokens to deliver up to 10.57x faster video generation and 55x lower memory use.
Microsoft Research introduces Mirage
Latent spatial memory stores 3D scenes directly as latent tokens, skipping the costly RGB render-and-reencode loop. The result is up to 10.57x faster video generation, 55x lower memory use, and state-of-the-art consistency on WorldScore.

Paper: https://huggingface.co/papers/2606.09828

Paper: https://paperswithcode.co/paper/2606.09828
Project: https://aka.ms/latent-spatial-memory/
Code: https://github.com/microsoft/LatentSpatialMemory

@_akhaliq Thanks @_akhaliq! Author thread: explore more at https://aka.ms/latent-spatial-memory

@_akhaliq Real question is whether this scales beyond demos or hits the same wall as every other world model.

@HuggingPapers Mirage stores 3D scenes as latent tokens, skipping the costly RGB render-and-reencode loop. 10.57x faster generation, 55x lower memory, state-of-the-art consistency on WorldScore. #MicrosoftResearch

@_akhaliq Please check out the SO-101 benchmark I posted!

@Rangfeng1117 @_akhaliq You can try it after we release code and ckpt!