Tech · 4h ago

CoreAutoAI's Rohan Anil and Dimitris Papailiopoulos discuss Fable and Chebyshev learning rate schedules for neural network optimization

Trying Fable in parallel with his current run would double Papailiopoulos's burn rate.

Original post
rohan anil @_arohan_ · #102 in Tech

@DimitrisPapail Amazing, fable could one shot this, but may ban you.

Another question was chebyshev lr schedule to avoid momentum

Is momentum allowed?

@_arohan_ pushing to Idea descent by codex. see what happens..

10:42 AM · Jun 11, 2026 · 84 Views
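For readers unfamiliar with the idea in the post, here is a minimal sketch of a Chebyshev learning rate schedule for plain gradient descent without momentum. It assumes a toy quadratic objective whose curvature lies in a known range; the names `chebyshev_steps` and `run_gd` and all constants are hypothetical and are not taken from the thread or from the speedrun code. Real network losses are not quadratic, and the order in which the steps are applied can matter for numerical stability, so treat this as an illustration only.

```python
import math

def chebyshev_steps(lam_min, lam_max, num_steps):
    """Step sizes 1/r_k, where r_k are the roots of the degree-num_steps
    Chebyshev polynomial mapped from [-1, 1] onto [lam_min, lam_max]."""
    center = 0.5 * (lam_max + lam_min)
    radius = 0.5 * (lam_max - lam_min)
    roots = [
        center + radius * math.cos((2 * k - 1) * math.pi / (2 * num_steps))
        for k in range(1, num_steps + 1)
    ]
    return [1.0 / r for r in roots]

def run_gd(eigs, steps):
    """Plain gradient descent (no momentum) on f(x) = 0.5 * sum(eig_i * x_i**2),
    starting from x = (1, ..., 1). Returns the largest remaining |x_i|."""
    x = [1.0] * len(eigs)
    for lr in steps:
        # The gradient of f with respect to x_i is eig_i * x_i.
        x = [xi - lr * e * xi for xi, e in zip(x, eigs)]
    return max(abs(xi) for xi in x)

# Hypothetical toy problem: 50 curvatures spread over [1, 100], 16 steps.
lam_min, lam_max, num_steps = 1.0, 100.0, 16
eigs = [lam_min + i * (lam_max - lam_min) / 49 for i in range(50)]

cheb_err = run_gd(eigs, chebyshev_steps(lam_min, lam_max, num_steps))
const_err = run_gd(eigs, [2.0 / (lam_min + lam_max)] * num_steps)

# Worst-case error factor of the Chebyshev schedule after num_steps steps.
sigma = (lam_max + lam_min) / (lam_max - lam_min)
cheb_bound = 1.0 / math.cosh(num_steps * math.acosh(sigma))

print("chebyshev:", cheb_err, "constant:", const_err, "bound:", cheb_bound)
```

On this toy problem the varying step sizes shrink the worst-case error more than the best constant step size does over the same number of steps, which is the sense in which such a schedule can stand in for momentum.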
Sentiment

Users welcomed the suggestion of a Chebyshev learning rate schedule as an alternative to momentum and fondly recalled prior work at Anthropic.

Pos: 100.0%
Neg: 0.0%
2 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity

@_arohan_ not allowed yet but i will allow it in 2-3 hours when i lose hope about vanilla SGD (and by hope i mean money)

4h · Views 69 · Likes 0 · Bookmarks 0

@_arohan_ do you think fable would be legitimately better? i can give it a shot in parallel but that means 2x burn rate :(

4h · Views 61 · Likes 0 · Bookmarks 0

@giffmana no init play, not sure if it's allowed in the original speedrun leaderboard thingy

@DimitrisPapail But, did it play with warmup and inits at all?

1h · Views 54 · Likes 0 · Bookmarks 0

@DimitrisPapail But, did it play with warmup and inits at all?

@DimitrisPapail This should go much better

1h · Views 42 · Likes 0 · Bookmarks 0
rohan anil @_arohan_

@DimitrisPapail Yeah being at Anthropic and one shotting stuff was really fun. I was pretty addicted

4h · Views 39
rohan anil @_arohan_

@DimitrisPapail If you push to github, we can use it as a ledger. I am going on vacation soon, so will have some time

4h · Views 18

@giffmana it's going a bit better but not crazy better. trying weird LR schedules and grad normalization and clipping.

@DimitrisPapail This should go much better

1h · Views 18 · Likes 0 · Bookmarks 0
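As a rough illustration of the "grad normalization and clipping" mentioned in the last reply, the sketch below shows one common form of each on a flat list of gradient values. The function names and the `eps` constant are hypothetical; the thread does not say which variants were actually tried.

```python
import math

def grad_norm(grad):
    """Euclidean (L2) norm of a flat gradient vector."""
    return math.sqrt(sum(g * g for g in grad))

def clip_by_norm(grad, max_norm):
    """Rescale the gradient only if its norm exceeds max_norm."""
    norm = grad_norm(grad)
    if norm <= max_norm:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]

def normalize(grad, eps=1e-12):
    """Always rescale the gradient to (approximately) unit norm.
    eps is an assumed small constant that guards against division by zero."""
    norm = grad_norm(grad)
    return [g / (norm + eps) for g in grad]

clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)   # norm 5 is scaled down to 1
untouched = clip_by_norm([0.3, 0.4], max_norm=1.0)  # norm 0.5 is left as is
unit = normalize([3.0, 4.0])
print(clipped, untouched, unit)
```

Clipping leaves small gradients alone and only caps large ones, while normalization discards the gradient's magnitude entirely and keeps only its direction.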