DiPOD framework stabilizes diffusion language model post-training, boosting Sudoku reasoning accuracy from 22% to 97%

Integration requires only a single line of code.

Original post

Haozhe Jiang@erichzjiang

Why aren’t Diffusion Language Models smart yet? The lack of stable post-training is a major bottleneck!

Meet DiPOD: the tripod for diffusion model post-training.

DiPOD boosts accuracy across reasoning tasks, with Sudoku jumping from 22% to 97%, through a one-line code change.

🧵1/5

3:34 PM · Jun 16, 2026 · 7.9K Views

Sentiment

Users are excited about DiPOD's simple one-line change boosting diffusion language model reasoning accuracy, calling the research cool and exciting.

Pos: 100.0% · Neg: 0.0%

2 comments with sentiment.

Posts from X

Haozhe Jiang@erichzjiang

Paper: https://arxiv.org/abs/2606.13795
Code: https://github.com/Astro-Eric/DiPOD-release
Blog: https://astro-eric.github.io/blogs/dipod/
This is an amazing collaboration with @HavenFeng, @pabbeel, @JiantaoJ, @akanazawa, @nhaghtal

🧵5/5

Haozhe Jiang@erichzjiang

DiPOD tackles double drift by interleaving gradient steps with self-distillation. In practice, this amounts to adding a regularization term to the original objective, which consistently improves results on GSM8K, MATH500, Countdown, and Sudoku.

🧵3/5
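
The "regularization term added to the original objective" idea from post 3/5 can be sketched roughly as follows. This is only an illustration under loud assumptions: the function names, the `beta` weight, and especially the form of the regularizer (a squared gap between the current proxy log-likelihood and a frozen snapshot of it) are hypothetical placeholders, not DiPOD's actual term. The paper and released code are the authority on the real objective.

```python
from typing import Sequence


def proxy_pg_loss(proxy_logps: Sequence[float], advantages: Sequence[float]) -> float:
    """REINFORCE-style surrogate that uses a proxy in place of the true log-likelihood."""
    n = len(proxy_logps)
    return -sum(a * lp for a, lp in zip(advantages, proxy_logps)) / n


def placeholder_regularizer(proxy_logps: Sequence[float], snapshot_logps: Sequence[float]) -> float:
    """Hypothetical stand-in for the regularizer: mean squared gap to a frozen snapshot."""
    n = len(proxy_logps)
    return sum((lp - s) ** 2 for lp, s in zip(proxy_logps, snapshot_logps)) / n


def regularized_loss(
    proxy_logps: Sequence[float],
    snapshot_logps: Sequence[float],
    advantages: Sequence[float],
    beta: float = 0.1,  # assumed weight, not a value from the paper
) -> float:
    """Original proxy objective plus one extra regularization term."""
    base = proxy_pg_loss(proxy_logps, advantages)
    return base + beta * placeholder_regularizer(proxy_logps, snapshot_logps)
```

Setting `beta=0.0` recovers the unregularized proxy objective, which is one way to read the "one-line change" framing: the training loop stays the same and only the loss gains an extra term.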

Haozhe Jiang@erichzjiang

DLM post-training is hard because the log-likelihood is intractable, so people replace it with proxies. We identify the double drift issue: the proxy drifts from the log-likelihood, and the gradient subsequently drifts from the policy gradient.

🧵2/5
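
One way to read the double drift claim in post 2/5, written out in symbols. The notation is assumed for illustration and is not taken from the paper: $\pi_\theta$ is the model, $R$ the reward, and $\tilde{\ell}_\theta(x)$ some tractable proxy for $\log \pi_\theta(x)$ (for example an ELBO-style bound).

```latex
% Exact policy gradient (needs the intractable log-likelihood):
\nabla_\theta J(\theta)
  = \mathbb{E}_{x \sim \pi_\theta}\!\left[ R(x)\, \nabla_\theta \log \pi_\theta(x) \right]

% Proxy-based estimate actually used in training:
\hat{g}(\theta)
  = \mathbb{E}_{x \sim \pi_\theta}\!\left[ R(x)\, \nabla_\theta \tilde{\ell}_\theta(x) \right]

% Gap between the two:
\hat{g}(\theta) - \nabla_\theta J(\theta)
  = \mathbb{E}_{x \sim \pi_\theta}\!\left[ R(x)\, \nabla_\theta
      \bigl( \tilde{\ell}_\theta(x) - \log \pi_\theta(x) \bigr) \right]
```

Under this reading, the first drift is the gap $\tilde{\ell}_\theta - \log \pi_\theta$ itself, and the second is the resulting gap between $\hat{g}$ and the true policy gradient, which vanishes only when the gradient of the proxy gap does.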

Haozhe Jiang@erichzjiang

DiPOD takes a variational inference perspective and provides a theoretical framework for analyzing policy gradient algorithms for generative models. It could produce useful algorithms in other domains, such as robotics, as well.

🧵4/5

Haozhe Jiang@erichzjiang

@berkeley_ai

Vik@vkalahas

@erichzjiang this is some cool research! diffusion models for language, more than just images

Vishnu Teja Kunde@sampleparticle

@erichzjiang Congratulations on this exciting work on RL for diffusion LLMs! Our recent paper (https://arxiv.org/pdf/2603.12554) also explores RL post-training for DLMs. Since we study similar benchmarks, it would be interesting to compare approaches and results. Looking forward to further progress!
