This is a natural continuation of this group's several previous works on optimization, and I really like their style.
Lots of experiments that look at every possible detail, as opposed to a wall of theory followed by a single experiment with untuned baselines.
Our paper is now on arXiv: https://arxiv.org/abs/2606.25971 Besides all the details and discussion of the broader literature, it also contains lots of other experiments that answer some of the questions we have already received. For example:




