Paper introduces flow language models on hyperspheres

The paper "Language Modeling with Hyperspherical Flows" presents flow language models (FLMs) that inject noise by rotating token embeddings on the unit hypersphere 𝕊^{d−1} instead of adding Gaussian noise. Training draws noise uniformly on the sphere, applies spherical linear interpolation (SLERP), and optimizes a cross-entropy loss on the posterior. Sampling runs N-1 Euler steps along tangent vectors. The approach targets the directional structure of discrete text embeddings.
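As a rough illustration of the pieces named in the summary (uniform noise on the sphere, SLERP, and Euler steps along tangent vectors), here is a minimal NumPy sketch. It is not the paper's code. All function names are hypothetical, the learned model is replaced by a closed-form geodesic velocity toward a known target, and the Euler step uses the sphere's exponential map, which is one common choice that the summary does not confirm.

```python
import numpy as np


def sample_uniform_sphere(rng, n, d):
    """Draw n points uniformly on S^{d-1} by normalizing Gaussian vectors."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def slerp(x0, x1, t, eps=1e-7):
    """Spherical linear interpolation from x0 (t=0) to x1 (t=1).

    Rows of x0 and x1 are unit vectors. Nearly antipodal pairs are not
    handled specially in this sketch.
    """
    cos_omega = np.clip(np.sum(x0 * x1, axis=-1, keepdims=True), -1.0, 1.0)
    omega = np.arccos(cos_omega)
    sin_omega = np.sin(omega)
    # Fall back to linear weights when the two points nearly coincide.
    safe = sin_omega > eps
    denom = np.where(safe, sin_omega, 1.0)
    w0 = np.where(safe, np.sin((1.0 - t) * omega) / denom, 1.0 - t)
    w1 = np.where(safe, np.sin(t * omega) / denom, t)
    out = w0 * x0 + w1 * x1
    return out / np.linalg.norm(out, axis=-1, keepdims=True)


def project_to_tangent(x, v):
    """Remove the component of v along x, leaving a tangent vector at x."""
    return v - np.sum(x * v, axis=-1, keepdims=True) * x


def exp_map(x, u, eps=1e-12):
    """Move from x along tangent vector u, staying on the sphere."""
    norm = np.linalg.norm(u, axis=-1, keepdims=True)
    direction = u / np.maximum(norm, eps)
    out = np.cos(norm) * x + np.sin(norm) * direction
    return out / np.linalg.norm(out, axis=-1, keepdims=True)


def geodesic_velocity(x, target, t, eps=1e-12):
    """Assumed stand-in for a learned velocity field.

    Returns the tangent vector at x that follows the geodesic to `target`
    and arrives there at time 1.
    """
    cos_omega = np.clip(np.sum(x * target, axis=-1, keepdims=True), -1.0, 1.0)
    omega = np.arccos(cos_omega)
    tangent = project_to_tangent(x, target)
    norm = np.linalg.norm(tangent, axis=-1, keepdims=True)
    direction = tangent / np.maximum(norm, eps)
    return omega * direction / (1.0 - t)


def euler_sample(x0, target, num_points):
    """Take num_points - 1 Euler steps on the time grid from 0 to 1."""
    ts = np.linspace(0.0, 1.0, num_points)
    x = x0
    for t, t_next in zip(ts[:-1], ts[1:]):
        v = geodesic_velocity(x, target, t)
        x = exp_map(x, (t_next - t) * v)
    return x
```

For example, `slerp(noise, data, 0.3)` returns points 30% of the way along the great-circle arc from `noise` to `data`, and `euler_sample(noise, data, 8)` takes seven steps. In a real model the velocity would come from a network's prediction rather than from a known target.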

Original post

🔥 New paper: Language Modeling with Hyperspherical Flows Recent flow language models (FLMs) all use Gaussian noise. Makes sense for images, but not necessarily for text 🫠 We propose to add noise by rotating embeddings on 𝕊^{d−1} instead 🌐 w/ @caglarml (1/9)

3:18 PM · May 13, 2026
Today in continuous diffusion language models, we have:
- Spherical flows https://arxiv.org/abs/2605.05629
- Hyperspherical flows https://arxiv.org/abs/2605.11125

Another case of convergent evolution! Two different takes on the same core idea, published within days of each other.

Justin Deschenaux @jdeschena

🔥 New paper: Language Modeling with Hyperspherical Flows Recent flow language models (FLMs) all use Gaussian noise. Makes sense for images, but not necessarily for text 🫠 We propose to add noise by rotating embeddings on 𝕊^{d−1} instead 🌐 w/ @caglarml (1/9)

10:18 PM · May 13, 2026 · 47.7K Views
10:50 AM · May 14, 2026 · 25K Views

Continuous diffusion/flow models have been very successful for image generation, but for language they are still in their early days, and this work pushes the area in an important direction.

Our key insight in this paper: use the geometry of embeddings, rather than borrowing Gaussian corruption from images to inject noise.

Very proud to have supervised and collaborated on this project with @jdeschena. Great execution in a very short amount of time 👏

5:13 PM · May 14, 2026 · 2.6K Views

Also @sedielem, @ziyuwang and @NandoDF you might be interested in this work.

Caglar Gulcehre @caglarml

Continuous diffusion/flow models have been very successful for image generation, but for language they are still in their early days, and this work pushes the area in an important direction. Our key insight in this paper: use the geometry of embeddings, rather than borrowing Gaussian corruption from images to inject noise. Very proud to have supervised and collaborated on this project with @jdeschena. Great execution in a very short amount of time 👏

5:13 PM · May 14, 2026 · 2.6K Views
5:15 PM · May 14, 2026 · 145 Views

@sedielem We really need to work on better benchmarking now

Sander Dieleman @sedielem

Today in continuous diffusion language models, we have:
- Spherical flows https://arxiv.org/abs/2605.05629
- Hyperspherical flows https://arxiv.org/abs/2605.11125

Another case of convergent evolution! Two different takes on the same core idea, published within days of each other.

10:50 AM · May 14, 2026 · 25K Views
11:01 AM · May 14, 2026 · 412 Views