
DriftXpress cuts training costs for drifting diffusion models


DriftXpress applies the Nyström approximation to a subsample of pre-computed landmarks drawn from real and attractor data. The technique trains drifting diffusion models with lower computational expense while preserving sample quality. Side-by-side CIFAR-10 experiments show earlier formation of recognizable objects, faster wall-clock convergence, and stronger FID reduction than prior drifting baselines that rely on mini-batch summaries.
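The core idea, a low-rank Nyström approximation built from a small set of landmark points, can be sketched as follows. This is an illustrative reconstruction, not the DriftXpress implementation: the RBF kernel, the landmark count, the bandwidth `gamma`, and the random feature vectors are all stand-in assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.05):
    """Pairwise RBF kernel between the rows of X and Y (illustrative choice)."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def nystrom_approx(X, landmarks, gamma=0.05):
    """Low-rank Nystrom approximation of the full kernel matrix:
    K ~= C @ pinv(W) @ C.T, with C = K(X, landmarks), W = K(landmarks, landmarks).
    Cost is O(n * m) kernel evaluations instead of O(n^2)."""
    C = rbf_kernel(X, landmarks, gamma)          # (n, m) cross-kernel
    W = rbf_kernel(landmarks, landmarks, gamma)  # (m, m) landmark kernel
    return C @ np.linalg.pinv(W) @ C.T           # rank <= m approximation

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))                       # stand-in feature vectors
landmarks = X[rng.choice(200, 20, replace=False)]   # m = 20 pre-computed landmarks

K_full = rbf_kernel(X, X)
K_approx = nystrom_approx(X, landmarks)
rel_err = np.linalg.norm(K_full - K_approx) / np.linalg.norm(K_full)
print(f"relative Frobenius error: {rel_err:.3f}")
```

Because only the `n × m` cross-kernel against the landmarks is ever formed, the quadratic cost of comparing every training sample against every other is avoided, which is the kind of saving the thread attributes to DriftXpress.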

Original post

[1/6] Diffusion models are slow at inference. Drifting Models fix that but then training becomes the bottleneck. We asked: Is it possible to slash the training cost of drifting models without sacrificing quality? Our answer: DriftXpress. 🧵

7:45 AM · May 14, 2026
Reposted by

Ali did some amazing work on the hottest new generative model: drifting models (from Mingyang Deng @Goodeat258 et al., out of Kaiming He's group).

Speeds up training a lot using a low-rank Nyström approximation. Check out Ali's full thread. Paper and code available!

Ali Falahati @Ali__Falahati


2:45 PM · May 14, 2026 · 15.4K Views
2:55 PM · May 14, 2026 · 10.1K Views

> these summaries cover the entire training set

oh cool, so theoretically the summaries can be a non-parametric generator? and DriftXpress kinda distills from it?

3:55 PM · May 14, 2026 · 2.7K Views