FlashAttention 4 Code Reverse-Engineered in GPU Mode Talk

levi@levidiamode

157/365 of GPU Programming

Another FlashAttention 4 resource that's been really helpful for me is the talk @charles_irl gave last year on GPU Mode about FA4's code and the evolution leading up to FA4. It's basically the lecture version of the "We reverse-engineered Flash Attention 4" blog post, which is awesome as well.

Really cool how the Modal team broke down the code before the paper release and made educated inferences about the forward pass.

Wish more people did deeper code dissections like this!

8:52 AM · Jun 9, 2026 · 1.6K Views
Sentiment

Users praised the Modal team for reverse-engineering FlashAttention 4 code before the paper release, calling the technical effort amazing and expressing gratitude for the service.

Pos: 100.0% · Neg: 0.0% (2 comments with sentiment)
Cluster Engagement
Posts from X
Charles 🎉 Frye@charles_irl

@levidiamode 🫡

5h · Views 199 · Likes 3 · Bookmarks 1 · Replies 1
levi@levidiamode

@charles_irl
- Link to talk: https://www.youtube.com/watch?v=ZIEq-WTquy4
- Link to blog post: https://modal.com/blog/reverse-engineer-flash-attention-4

6h · Views 146 · Likes 4 · Bookmarks 2 · Retweets 2
levi@levidiamode

@charles_irl thank you for your service 🫡

5h · Views 28
ゆうま@yuumalow6fzn

@levidiamode @charles_irl If I dig deep into the FA4 code, the p99 in my own routing might shift quite a bit. Whoever reverse-engineered it is seriously impressive.

5h · Views 2