I ended up prompting it this morning because I was helping some users who were still confused about their fragmentation problems and didn't realize they should use expandable segments. So nothing here is new, but I'm trying to spell it out more clearly for the masses!
New devlog post from yours truly: When does fragmentation occur in the CUDA caching allocator? https://docs.pytorch.org/devlogs/eager/2026-06-01-cuda-caching-allocator/ -- this post is LLM-authored, but I heavily prompted/edited it, and Natalia also helped fact-check.