Susan Zhang@suchenzang
i guess i'm questioning two things:
% steps saved as a good metric (doesn't control for fixed amount of data)
and whether there's meaningful problems to tackle where data can't be scaled (if so, should either scale data or reconsider if its a good gradient descent problem to tackle with a compute advantage)
1:00 PM · Apr 28, 2026 · 6.5K Views