CMU releases a free online book on GPU programming for machine learning systems, covering NVIDIA's Blackwell architecture

The curriculum covers attention, prefill, and fused MoE kernels

Original post

Tianqi Chen@tqchenml

We taught a brand-new mini-series this year at @SCSatCMU on Modern GPU Programming for ML Systems, as part of the ML Systems course, touching on fun questions like what data layout swizzling is, how to use 3D TMA, and state-of-the-art Blackwell programming. We released a curated online book based on the materials: https://mlc.ai/modern-gpu-programming-for-mlsys/ Check it out.

4:30 AM · Jun 23, 2026 · 111.3K Views

Sentiment

Users are praising CMU's online book on modern GPU programming for ML systems because it offers practical techniques and bridges the gap between ML engineering and low-level systems knowledge.

Positive: 100.0% · Negative: 0.0%

15 comments with sentiment.

Via MLC.AI

Posts from X

Tianqi Chen@tqchenml

Thanks @modal for compute support for the course, and the amazing course staff for making it happen. Finally, the effort is made possible by the open source TIRx compiler effort led by @bohanhou1998 and many other collaborators.

Tianqi Chen@tqchenml

@levidiamode @SCSatCMU We didn't manage to have a public recording this time; hopefully the curated materials and interactive demos help :)

levi@levidiamode

@tqchenml @SCSatCMU amazing work! any chance you're gonna publish the lecture recordings as well?

sshkhr@sshkhr16

@tqchenml @Soul0Engineer @SCSatCMU This is quite incredible thanks @tqchenml

pengcheng@niu_pc

@tqchenml @SCSatCMU awesome

Mohammed@mohamme57576867

@tqchenml @SCSatCMU sounds awesome, love the focus on practical gpu techniques

Manpreet Singh@manpree59181175

@tqchenml @SCSatCMU Wow, awesome :)

Hariharan @ CVPR 2026@bombrake

@tqchenml @SCSatCMU Beautiful

Astarag Mogapatra@Athekunal

@tqchenml @SCSatCMU Thanks @tqchenml. Any complementary materials for this course?

Noam Salinger@noam_salin15821

The rare curriculum that teaches across the boundary. Most ML engineers treat the kernel as a black box, and most systems courses stop at the API. The engineers who can reason from swizzling up to the scheduler are exactly who you want when the workload shape shifts underneath you.

Amin@Ma_Msvi

@tqchenml @SCSatCMU AWESOME!

Manash@shiba14857

@tqchenml @algo_diver @SCSatCMU Thanks a lot for this 🙏🏻

Narendra@naren11200

@tqchenml @SCSatCMU Thanks, Prof. Chen. I was looking for something like this; it is very helpful...

Adel Bucetta@adelbucetta

@tqchenml @SCSatCMU that's awesome, but i think we're still missing the most important part: why are gpu programmers getting more attention in the ml space? what does it say about the state of our industry that we're shifting focus from algo dev to compute optimization?

LEI WANG@yiakwy2023

@tqchenml @SCSatCMU Thx Tianqi for sharing this. It is quite a coincidence that we are just working on a related problem: SymmGemm (https://www.linkedin.com/posts/lei-wang-1722a28a_faster-symmul-with-thunderkittenspdf-share-7475402691364536322-TPX6/?utm_source=share&utm_medium=member_desktop&rcm=ACoAABLocGYBO0QGi8RFxdL6jUQf99aRtJxy15k). We developed some unique techniques to utilize NoC cluster multicast and L2 cache affinity for this problem, and hope it is a useful example.

saif abroad@ekdietcoke

@tqchenml @SCSatCMU thanks boss

Vipul Sharma@VipulS_1

@tqchenml @levidiamode @SCSatCMU I wish I could attend your seminars and lab meetings remotely. There's so much info relevant to my daily job that will become easy to just naturally acquire by listening in on those conversations.

fdelys@fdelys_

@tqchenml @SCSatCMU As a CMU student, what would prepare me the best for this type of GPU programming, ML Systems or DL systems?

bk@boulatbek

@tqchenml @levidiamode @SCSatCMU whyyyy?)
