@gneubig @parthsh_ We call those lower bounds in opt theory 😎 Still didn't get very close to Muon.
Graham Neubig (@gneubig)
@parthsh_ @DimitrisPapail I think that's fine-tuning the parameters of SGD on the test set without tuning the baseline parameters? 😅
6:33 PM · Jun 14, 2026