Use Claude or Codex to rewrite Yurii Nesterov’s papers in notation and diagrams that current deep learning practitioners understand.
I think this by itself may lift optimization education for frontier model developers.
Anil previously worked on Gemini pretraining at Google DeepMind.
Users agree that notation in optimization theory papers is the main bottleneck preventing practitioners from engaging with the ideas, supporting proposals to rewrite Nesterov papers in modern deep learning conventions.

@_arohan_ Probably the same treatment for OR on the applied side. Take the old papers on queues, scheduling, routing, inventory, allocation, etc and rewrite them in terms of agents, GPUs, tokens, evals, rollouts, traces, PRs, and human review. AI productivity is just optimization.

@_arohan_ wait so notations are the bottleneck for getting practitioners into optim theory? that's actually a fair point

@_arohan_ triggered a safety filter on Fabel. No can do. 🥶

@_arohan_ (im underwater)
"a weird quantum shape or just a drunk scribble"

@_arohan_ making theory digestible for the applied crowd is usually where the bottleneck lives
curious which paper you'd start with