
Parallax PLX Triton Kernel Cuts NanoGPT Latency 10-15% on H100

Original post: Zhaoran Wang (#865)
Yifei Zuo @YifeiZuoX

I ran a quick benchmark in nanogpt this morning and here's the result on my H100 machine:

DynMuon Attn: 159.04 ms
DynMuon PLX: 187.86 ms
SOAP-H Attn: 958.63 ms
SOAP-H PLX: 1054.16 ms

Parallax shows a 10–15% latency drop using the PLX Triton kernel compared with SDPA, as used in the nanoGPT optimizer track. @Sam_Acqua
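To check the percentage, the four figures can be compared directly. The sketch below is only arithmetic on the numbers as labeled in the post. The post does not say exactly what each latency measures, so no assumption is made about that here. The names `latency_ms` and `relative_gap` are illustrative and do not come from the Parallax code.

```python
# Latencies in milliseconds, copied from the post exactly as labeled there.
latency_ms = {
    "DynMuon": {"Attn": 159.04, "PLX": 187.86},
    "SOAP-H": {"Attn": 958.63, "PLX": 1054.16},
}


def relative_gap(attn_ms, plx_ms):
    """Return the PLX-minus-Attn gap as a fraction of each baseline."""
    gap = plx_ms - attn_ms
    return gap / attn_ms, gap / plx_ms


# Map each optimizer name to (gap relative to Attn, gap relative to PLX).
gaps = {
    name: relative_gap(row["Attn"], row["PLX"])
    for name, row in latency_ms.items()
}

for name, (vs_attn, vs_plx) in gaps.items():
    print(f"{name}: gap is {vs_attn:.1%} of Attn, {vs_plx:.1%} of PLX")
```

With these inputs the gap between the two columns comes out at roughly 9 to 18 percent, depending on the optimizer and on which column is taken as the baseline.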

Yifei Zuo @YifeiZuoX

Very impressive results from Min Li and @Haoxiang__Wang: simply swapping Attention for Parallax reaches 2880 steps with the SOAP-H optimizer, beating the latest SOTA record on modded-nanogpt (@kellerjordan0) with no hyperparameter tuning.

A few observations:
- Parallax is uniformly stronger than Softmax Attention across all records.
- Optimizers don't transfer to Parallax with the same magnitude, which confirms the optimizer–architecture interaction from the Parallax paper.
- The cleanest modifications often transfer best; records built on heavy tuning transfer less reliably.

These are preliminary results; I believe both the Parallax architecture and the optimizer side have room to improve. Code is open-sourced below; give it a try.

Code: https://github.com/Yifei-Zuo/modded-nanogpt-plx/tree/master/parallax
Kernel: https://github.com/Yifei-Zuo/Parallax
Paper: https://arxiv.org/abs/2605.29157

11:42 AM · Jun 3, 2026 · 554 Views
Zhaoran Wang @zhaoran_wang

more latency drop, too!


6h · Views 62 · Likes 3 · Bookmarks 1