Tech · 21h ago

Modal Engineer Presents Stanford Talk On Transformer Inference Deployment

Original post
Charles 🎉 Frye @charles_irl · #927 in Tech

much gratitude to the many colleagues at @modal whose insights & work i shared here -- @_gongy, @_dcw02, @saatwiknagpal, @DevenNavani, @be4ncurd, @emilyhanyf, @hsubbaraj, @racerfunction, @luiscape, @jonobelotti_IO, @mma12261, @teenychairs & probably others <3

Tried to squeeze the most important bits about the entire stack for cloud deployment of transformer inference, from application layer concerns to hardware, debugging, and o11y, into one talk. Had to operate at a very high tok/s!

https://www.youtube.com/watch?v=ZUdIsRZhWXI

6:30 PM · Jun 9, 2026 · 1.1K Views
Sentiment

The author thanks Modal colleagues for the insights shared in the Stanford talk on full-stack cloud transformer inference, and replies enthusiastically celebrate the technology.

Positive: 100.0% · Negative: 0.0% · 2 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Views 14K · Bookmarks 218 · Likes 268 · Retweets 24 · Replies 9

Tried to squeeze the most important bits about the entire stack for cloud deployment of transformer inference, from application layer concerns to hardware, debugging, and o11y, into one talk. Had to operate at a very high tok/s!

https://www.youtube.com/watch?v=ZUdIsRZhWXI

21h · Views 14K · Likes 268 · Bookmarks 218

good day to think about owning yr intelligence btw

15h · Views 2.5K · Likes 28 · Bookmarks 8
Alex Mirran @alex_mirran

The business of Inference is so interesting and extremely difficult! This is an awesome talk from @charles_irl

20h · Views 1.2K · Likes 7 · Bookmarks 4

@modal @_gongy @_dcw02 @saatwiknagpal @DevenNavani @be4ncurd @emilyhanyf @hsubbaraj @racerfunction @luiscape @jonobelotti_IO @mma12261 @teenychairs @_dcw02 i apologize for screen-capping our hinge-free DMs and then sharing it with the whole world

21h · Views 178 · Likes 2
David Wang @_dcw02

@charles_irl @modal @_gongy @saatwiknagpal @DevenNavani @be4ncurd @emilyhanyf @hsubbaraj @racerfunction @luiscape @jonobelotti_IO @mma12261 @teenychairs machine god go brrr

21h · Views 29 · Likes 1

@_dcw02 @modal @_gongy @saatwiknagpal @DevenNavani @be4ncurd @emilyhanyf @hsubbaraj @racerfunction @luiscape @jonobelotti_IO @mma12261 @teenychairs in the grimdark future there is only brrr

21h · Views 25 · Likes 2