
Charles Frye proposed training machine learning models with large amounts of "implementation noise" so that the learned network is more robust to floating-point nondeterminism and nonassociativity.

A reply reframed the approach as standard regularization.

Original post

my gut says that to solve float numerics problems from nondeterminism x nonassociativity, we need to think bigger than determinism. models could eg be trained with large amounts of "implementation noise" so that the learned network is more robust to implementation skew.

12:50 PM · May 22, 2026
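
To make the idea in the post concrete, here is a rough Python sketch, not anything the post itself specifies. It first shows that floating-point addition is nonassociative, then perturbs a dot product the way a different kernel or reduction order might. The helper `noisy_dot`, its `rel_noise` parameter, and the uniform noise model are all hypothetical, chosen only for illustration.

```python
import random

# Floating-point addition is not associative: regrouping changes the result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
nonassociative = left != right  # True for IEEE 754 doubles


def noisy_dot(weights, inputs, rel_noise, rng):
    """Dot product with simulated "implementation noise" (hypothetical helper).

    The products are accumulated in a random order, and each one is scaled by
    (1 + eps) with eps drawn uniformly from [-rel_noise, rel_noise]. This is
    an assumed noise model, not one taken from the post.
    """
    terms = [w * x for w, x in zip(weights, inputs)]
    rng.shuffle(terms)  # mimic a reduction order that varies between runs
    total = 0.0
    for term in terms:
        total += term * (1.0 + rng.uniform(-rel_noise, rel_noise))
    return total


rng = random.Random(0)
weights = [0.5, -1.25, 2.0, 0.75]
inputs = [1.0, 2.0, 0.5, 4.0]

exact = sum(w * x for w, x in zip(weights, inputs))
noisy = noisy_dot(weights, inputs, rel_noise=1e-3, rng=rng)
print(nonassociative, exact, noisy)
```

Running a model's forward pass through something like `noisy_dot` during training would expose it to small, random numerical differences on every step, which is one way to read "implementation noise". It is also why the reply's point lands: structurally this is noise injection, a familiar form of regularization.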

@charles_irl ITS JUST REGULARIZATION

Charles 🎉 Frye @ MLSys @charles_irl


7:50 PM · May 22, 2026 · 3.6K Views
1:22 AM · May 23, 2026 · 80 Views