AsymFlow achieves 1.57 FID on ImageNet in pixel space
Hansheng Chen introduces AsymFlow, a flow-based image generation method that keeps velocity information in a low-rank subspace instead of relying on JiT-style x0-prediction. The model operates directly in pixel space without a VAE and records an FID of 1.57 on ImageNet, the best (lowest) among published pixel-space flow models. When used to finetune FLUX.2 klein, the resulting checkpoint outperforms the base model on HPSv3, DPG, and GenEval, ranking first overall on HPSv3 while delivering sharper textures and roughly 40 percent faster inference. The work was shared by researchers including Kosta Derpanis at York University.
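For readers unfamiliar with the distinction between x0-prediction and velocity-prediction, here is a minimal NumPy sketch of the standard flow-matching change of variables, plus a toy low-rank projection of the velocity. This is generic background only: the interpolation convention, the orthonormal basis `U`, and the rank are illustrative assumptions, not the AsymFlow parameterization described in the paper.

```python
import numpy as np

# Generic rectified-flow interpolation: x_t = (1 - t) * x0 + t * noise.
# Under this convention the target velocity is v = noise - x0, so an
# x0-prediction network and a velocity-prediction network are related
# by a simple change of variables.

rng = np.random.default_rng(0)
d, r = 64, 8                       # ambient (pixel) dim, toy subspace rank

x0 = rng.standard_normal(d)        # clean image (flattened), illustrative
noise = rng.standard_normal(d)     # Gaussian prior sample
t = 0.3
x_t = (1 - t) * x0 + t * noise     # point on the straight-line path
v = noise - x0                     # full-rank target velocity

# Hypothetical orthonormal basis U; projecting v onto its column span
# illustrates "keeping velocity in a low-rank subspace" in the abstract,
# NOT the paper's actual construction.
U, _ = np.linalg.qr(rng.standard_normal((d, r)))
v_lowrank = U @ (U.T @ v)

# Recovering x0 from a velocity estimate (exact for the true velocity):
x0_from_v = x_t - t * v
print(np.allclose(x0_from_v, x0))  # the two parameterizations agree
```

The projection step only discards components of `v` outside the subspace, so `||v_lowrank|| <= ||v||`; the interesting question the paper addresses is which subspace to keep, which this sketch does not attempt to answer.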
Cool work by my former lab at Stanford on pixel-space image diffusion!
New paper: AsymFlow🔥 JiT x0-prediction is not enough for pixel generation. Better keep velocity in a low-rank subspace:
- 1.57 FID on ImageNet (best pixel flow model)
- Finetunes FLUX.2 klein into pixel space, beats the original on HPSv3/DPG/GenEval (#1 overall on HPSv3)
1/7