🚨 New Paper 🚨 ScheduleFree+: Scaling Learning-Rate-Free & Schedule-Free Learning to Large Language Models
A few modifications to Schedule-Free Learning make it completely free of learning-rate tuning and let it greatly outperform schedules in long-duration training! https://arxiv.org/abs/2605.19095v1




