
Arthur Gretton posts an analysis questioning whether drifting models are fixed points of the Wasserstein gradient flow on the KL divergence or on an approximation to the Sinkhorn, or whether they are Wasserstein gradient flows at all.

The post examines the "Generative Modeling via Drifting" framework from Deng et al.

Original post

Your drifting model is secretly a fixed point for the Wasserstein gradient flow on... ...the KL? ...an approximation to the Sinkhorn? ...Is it even a Wasserstein gradient flow at all? https://arxiv.org/abs/2605.05118 @liwenliang @agalashov @JamesTThorn @ValentinDeBort1 @ArnaudDoucet1

8:50 AM · May 22, 2026

@ArthurGretton @liwenliang @agalashov @JamesTThorn @ValentinDeBort1 @ArnaudDoucet1 I actually tried the WGF method for KL with score estimators back in 2017 - didn’t know WGF methods, so I called it “KL-SGD” 😅 https://bayesiandeeplearning.org/2017/papers/39.pdf

Lesson for me re drifting models: many kernel-based generative models can work if they use kernels defined on pretrained features

Arthur Gretton @ArthurGretton

Your drifting model is secretly a fixed point for the Wasserstein gradient flow on... ...the KL? ...an approximation to the Sinkhorn? ...Is it even a Wasserstein gradient flow at all? https://arxiv.org/abs/2605.05118 @liwenliang @agalashov @JamesTThorn @ValentinDeBort1 @ArnaudDoucet1

3:50 PM · May 22, 2026 · 4.4K Views
5:35 PM · May 22, 2026 · 76 Views