Jaskirat Singh introduces RAEv2 as a simplified update to Representation Autoencoders delivering over 10 times faster convergence with improved reconstruction and generation quality

Original framework appeared last October; tests covered text-to-image and world models.

Original post

In Oct last year, Representation Autoencoders provided an elegant solution to unified tokenization for understanding and generation. Today we make them a bit more simple. a bit more general. Result: >10x faster convergence, better reconstruction, better generation. And yes we test them on T2I and world models :) Introducing RAEv2

2:04 PM · May 21, 2026

check out RAEv2 led by Jas. through extensive exps, we found some really intriguing behaviors showing why strong representation encoders are key for pixel decoders. spoiler: it’s not about hillclimbing fid; new metrics like ep@fid-k/fdr^k show there’s a lot more left to explore!

Quoted post: Jaskirat Singh (@1jaskiratsingh), the original post above.

9:04 PM · May 21, 2026 · 505.4K Views
10:52 PM · May 21, 2026 · 9.1K Views