2d ago

A pseudonymous PyTorch engineer notes that the CUDA caching allocator can make bugs in NVIDIA GPU code hard to diagnose, such as memory corruption from a kernel that shows up non-deterministically.

Edward Z. Yang replies, asking whether the bugs involve fragmentation or streams.

Original post

typedfemale@typedfemale

the CUDA caching allocator is such a great way to create extremely "interesting" bugs for yourself

1:55 PM · May 17, 2026

typedfemale@typedfemale

and add expandable segments if you want a real challenge

typedfemale@typedfemale

the CUDA caching allocator is such a great way to create extremely "interesting" bugs for yourself

8:55 PM · May 17, 2026 · 10.9K Views

9:00 PM · May 17, 2026 · 1.9K Views
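
For readers unfamiliar with the setting mentioned above: in PyTorch, expandable segments are an opt-in allocator mode selected through the `PYTORCH_CUDA_ALLOC_CONF` environment variable. A minimal sketch of turning it on, assuming the variable is set before the process first initializes CUDA; the helper name `enable_expandable_segments` is hypothetical and not part of PyTorch:

```python
import os


def enable_expandable_segments(env=os.environ):
    """Opt in to expandable segments (hypothetical helper, not a PyTorch API).

    Assumption: this runs before the process first initializes CUDA,
    since the allocator reads its configuration at startup.
    """
    existing = env.get("PYTORCH_CUDA_ALLOC_CONF", "")
    # Keep any other allocator options that are already configured.
    options = [opt for opt in existing.split(",") if opt]
    options = [opt for opt in options if not opt.startswith("expandable_segments:")]
    options.append("expandable_segments:True")
    env["PYTORCH_CUDA_ALLOC_CONF"] = ",".join(options)
    return env["PYTORCH_CUDA_ALLOC_CONF"]
```

Setting the variable in the shell that launches the job has the same effect and avoids any ordering concerns inside the script.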

typedfemale@typedfemale

@ezyang we have a kernel that's corrupting memory between the forward and backward pass and i think caching allocator was making it non-deterministic (really not its fault, i was just being stupid and didn't realize what was going on)

Edward Z. Yang@ezyang

@typedfemale What kinds of interesting bugs? Fragmentation? Streams?

11:08 PM · May 18, 2026 · 1.5K Views

11:22 PM · May 18, 2026 · 840 Views

Edward Z. Yang@ezyang

@typedfemale I have had some "fun" out of bounds bugs where CUDA sanitizer didn't help because all the memory accessed was technically valid 😂

11:25 PM · May 18, 2026 · 144 Views
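
The point about out-of-bounds accesses being "technically valid" can be illustrated with a toy model. The sketch below is not PyTorch or CUDA code: `CachingArena` and `buggy_fill` are hypothetical names, and a `bytearray` stands in for one large device allocation that a caching allocator carves into smaller blocks. Under those assumptions, a checker that only knows the bounds of the large allocation has nothing to flag when a write runs off the end of one block into its neighbor.

```python
class CachingArena:
    """Toy caching allocator: one big buffer carved into blocks (hypothetical)."""

    def __init__(self, size):
        self.memory = bytearray(size)  # stands in for one large device allocation
        self.offset = 0                # simple bump pointer; freeing is omitted

    def alloc(self, nbytes):
        if self.offset + nbytes > len(self.memory):
            raise MemoryError("arena exhausted")
        start = self.offset
        self.offset += nbytes
        return (start, nbytes)

    def in_bounds(self, address):
        # What an allocation-level checker can see: only the arena's bounds.
        return 0 <= address < len(self.memory)


def buggy_fill(arena, block, value):
    """Fill `block` with `value`, but write one byte too many (off-by-one)."""
    start, nbytes = block
    flagged = []
    for address in range(start, start + nbytes + 1):  # bug: the extra + 1
        if not arena.in_bounds(address):
            flagged.append(address)  # the checker would report this access
            continue
        arena.memory[address] = value
    return flagged


arena = CachingArena(64)
activations = arena.alloc(16)
gradients = arena.alloc(16)  # lands directly after `activations`

flagged = buggy_fill(arena, activations, 0xFF)
grad_start = gradients[0]
print(flagged)                   # no access was flagged
print(arena.memory[grad_start])  # yet the first gradient byte was overwritten
```

In this model, which block sits next to the overrun depends on the allocation history, which is one plausible reason such corruption can look non-deterministic. PyTorch documents the `PYTORCH_NO_CUDA_MEMORY_CACHING=1` environment variable for disabling caching while debugging memory errors, which can give a memory checker a better chance of seeing the overrun.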