PyTorch's caching allocator can produce exactly the same memory allocations from run to run, mostly by not using record stream, which removes one source of GPU nondeterminism in distributed training.
The exchange confirms that the approach is feasible for reproducibility goals.
——0——
@ezyang interesting, ok thanks i'll give it a try
@typedfemale So it's definitely feasible to have the exact allocations from the caching allocator be deterministic run to run. This mostly boils down to not using record stream. I would check on this!
11:29 PM · May 18, 2026 · 134 Views
11:30 PM · May 18, 2026 · 128 Views
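The point about record stream can be illustrated with a toy model. The sketch below is not PyTorch's allocator. `ToyCachingAllocator`, `trace`, and the `ready_at` tick are hypothetical names invented for illustration. It assumes that a block freed after a record-stream call is held back until the other stream's work has finished, so whether a later allocation can reuse it depends on GPU timing rather than on program order alone.

```python
class ToyCachingAllocator:
    """Toy model of a caching allocator (hypothetical, not PyTorch's code)."""

    def __init__(self):
        self.free_blocks = []   # addresses available for immediate reuse
        self.pending = []       # (address, ready_at) blocks held back by a recorded stream
        self.next_address = 0   # next fresh address to hand out

    def malloc(self, now):
        # Reclaim held-back blocks whose recorded stream has finished by `now`.
        still_pending = []
        for address, ready_at in self.pending:
            if ready_at <= now:
                self.free_blocks.append(address)
            else:
                still_pending.append((address, ready_at))
        self.pending = still_pending

        # Prefer the lowest cached address; otherwise hand out a fresh one.
        if self.free_blocks:
            self.free_blocks.sort()
            return self.free_blocks.pop(0)
        address = self.next_address
        self.next_address += 1
        return address

    def free(self, address, ready_at=None):
        if ready_at is None:
            # No recorded stream: the block is reusable right away.
            self.free_blocks.append(address)
        else:
            # Recorded stream: reuse is deferred until that stream is done.
            self.pending.append((address, ready_at))


def trace(side_stream_done_at, use_record_stream):
    """Return the addresses handed out by a fixed sequence of calls."""
    allocator = ToyCachingAllocator()
    a = allocator.malloc(now=0)
    if use_record_stream:
        allocator.free(a, ready_at=side_stream_done_at)
    else:
        allocator.free(a)
    b = allocator.malloc(now=2)
    c = allocator.malloc(now=4)
    return [a, b, c]
```

In this model, `trace(1, False)` and `trace(3, False)` return the same addresses, because reuse depends only on the order of calls. `trace(1, True)` and `trace(3, True)` differ, because the side stream's finish time decides whether the second allocation can reuse the first block. In real code the timing-independent behaviour would require ordering the streams explicitly before freeing, which the toy does not model.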