15h ago

Zhihan Yang and seven co-authors submit RePlaid, a continuous diffusion language model whose scaling is competitive with state-of-the-art discrete diffusion models, to arXiv


Preprint dated 18 May 2026 shows no discretization steps are required.

Original post

#1366@JWTHICKSTUNOP

Zhihan Yang@ZHIHANYANG_

📢Excited to share our new paper: Continuous Diffusion Scales Competitively with Discrete Diffusion for Language

We introduce RePlaid 🌊, a continuous diffusion language model (DLM) with
🏅Discrete likelihood bound
🏅Scaling laws competitive with SOTA discrete DLMs

How? Dive in👇[🧵1/12]

Paper: https://arxiv.org/abs/2605.18530

Work done with my amazing collaborators: @WeiGuo01 @ShuibaiZ69721 @ssahoo_ @YongxinChen1 @ArashVahdat @MardaniMorteza @jwthickstun

9:21 PM · May 18, 2026

Reposted by

#1822@LUCAAMB

QUOTE POST

#80Sander Dieleman@SEDIELEM

Did I mention continuous DLMs are back? I think I might have mentioned it before🤔

This one revisits Plaid (https://arxiv.org/abs/2305.18619), a continuous DLM trained with likelihood loss, and rigorously shows how it holds up against a recent discrete method. Pretty well, looks like!

Zhihan Yang@zhihanyang_

4:21 AM · May 19, 2026 · 19.6K Views

4:28 PM · May 19, 2026 · 2.8K Views

#359Tanishq Mathew Abraham, Ph.D.@ISCIENCELUVR

abs: https://arxiv.org/abs/2605.18530

Tanishq Mathew Abraham, Ph.D.@iScienceLuvr

Continuous Diffusion Scales Competitively with Discrete Diffusion for Language "we establish the first scaling law for continuous DLMs that rivals discrete DLMs: RePlaid exhibits a compute gap of only 20× compared to autoregressive models, outperforms Duo while using fewer parameters, and outperforms MDLM in the over-trained regime."

10:48 AM · May 19, 2026 · 4.2K Views

10:48 AM · May 19, 2026 · 1K Views

QUOTE POST

#1822Luca Ambrogioni@LUCAAMB

Yet another chapter of the 'continuous language diffusion works' story.

I'd say we should stop finding it surprising at this point

(spoiler: there was never a good reason to believe that continuous diffusion doesn't work, just groupthink)

Zhihan Yang@zhihanyang_

4:21 AM · May 19, 2026 · 19.6K Views

8:33 AM · May 19, 2026 · 2.7K Views
