12d ago

Omar Khattab signals new variance reduction method surpassing state of the art

Omar Khattab, assistant professor at the MIT CSAIL NLP group, replied "until ~tomorrow" to a post by research engineer Will Brown about the state-of-the-art method for variance reduction, signaling that a new method would soon surpass it. Ten days later, Khattab followed up by pointing to Souradip Chakraborty's announcement of Pedagogical RL, joking that "~tomorrow" had turned out to mean about 10 days. Brown replied that the delay was off by only a "constant factor," called the approach very nice, and said he was excited to dig into it further. The exchange occurred in a reply thread discussing reinforcement learning techniques.

Original post

@willccbb until ~tomorrow

will brown @willccbb

SOTA method for variance reduction:

9:36 PM · May 4, 2026 · 11.8K Views
9:39 PM · May 4, 2026 · 4.7K Views

@willccbb "~Tomorrow" ~= 10 days, give or take - right?

Souradip Chakraborty @SOURADIPCHAKR18

🚨Typical RL algorithms and on-policy distillation methods are blind samplers: they use privileged info to score rollouts, but not to *find* them. We ask: can we use privileged info to *actively sample* the rollouts RL wishes it can stumble upon with compute? ⤵️ Pedagogical RL

10:46 PM · May 14, 2026 · 81.9K Views
11:10 PM · May 14, 2026 · 1.4K Views

@lateinteraction constant factor, close enough :) veeeery nice approach, really excited to dig into it further!

Omar Khattab @lateinteraction

@willccbb "~Tomorrow" ~= 10 days, give or take - right?

11:10 PM · May 14, 2026 · 1.4K Views
11:11 PM · May 14, 2026 · 5.8K Views