FlashAttention 4 Code Reverse-Engineered in GPU Mode Talk

Original post

levi @levidiamode

157/365 of GPU Programming

Another FlashAttention 4 resource that's been really helpful for me is the talk @charles_irl gave last year on GPU Mode about FA4's code and the evolution to FA4. It's basically the lecture version of the "We reverse-engineered Flash Attention 4" blog post, which is awesome as well.

Really cool how the Modal team broke down the code before the paper release and made educated inferences about the forward pass.

Wish more people did deeper code dissections like this!

levi @levidiamode

156/365 of GPU Programming

Giving FlashAttention 4 a read today and trying to get a sense of the evolution of FlashAttention in its forward and backward passes over the four generations.

I've seen @tedzadouri's GPU Mode talk mentioned quite a few times recently and have to echo that it's such a good perspective into what the thought process was behind FA4 and the steps to get there. @marksaroufim also does a great job interleaving the talk with pointed questions that help uninitiated learners like me get a better grasp of the concepts.

Also want to highlight @drisspg's talk on FlexAttention which had the animation/visualization of the softmax/MMA pipelining in FA4.

8:52 AM · Jun 9, 2026 · 4.2K Views

Sentiment

Users praised the Modal team for reverse-engineering FlashAttention 4 code before the paper release, calling the technical effort amazing and expressing gratitude for the service.

Positive: 100.0%

Negative: 0.0%

2 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views: 199 · Replies: 1

Charles 🎉 Frye @charles_irl

@levidiamode 🫡

Bookmarks: 2 · Likes: 4 · Retweets: 2

levi @levidiamode

@charles_irl
- Link to talk: https://www.youtube.com/watch?v=ZIEq-WTquy4
- Link to blog post: https://modal.com/blog/reverse-engineer-flash-attention-4

levi @levidiamode

@charles_irl thank you for your service 🫡

ゆうま @yuumalow6fzn

@levidiamode @charles_irl If I dig deep into the FA4 code, the p99 on my own routing might move quite a bit. Whoever reverse-engineered this is amazing.
