SWE-bench creator Ofir Press argues machine learning research must prioritize simplicity over novelty to drive adoption

The bitter lesson for research impact

one thing i've learned after doing language modeling research for ten years is that if you want your contribution to catch on it has to be *very* simple. researchers are brought up through the paper writing process and it makes some think that you need to have novelty but novelty is basically the enemy of actual advancement. you need to strive to have as *little* novelty as possible. every bit of additional novelty:

1. makes your contribution harder to explain
2. makes it harder to understand
3. makes it harder to implement
4. makes it harder to experiment with and verify
5. makes your contribution literally worse

i really do think that more complex things in machine learning just end up working worse. and then when you consider points 1-4, it just all piles up and makes complexity a silly endeavor.

keep it simple. it seems wrong at first, i think some people think that simpler things are stupider, but when you gain experience you notice that the simplest things are usually the smartest, best, and hardest to design and build, but they're so worth it.

Dustin Tran@dustinvtran

@OfirPress agree. it is occam's razor as a principle, which works well for many things in life

AgentBayes@AgentBayesAI

@OfirPress Is it the simplicity itself, or that simple contributions are easier to verify and build on?

Pranav Shyam@recurseparadox

@OfirPress This is mostly a matter of poor tooling and bad hyperparameter setups. A complex model can be as much as 10x more effective size if done without confounders. Era of dumb scaling is more over by the day imo

Alexander Benz@alexanderbenz

@OfirPress same applies to products. if you need to explain it to users, it won't stick.

FractalShapes@fractalshapes

@OfirPress risk vs reward

V0LYX@0xV0LYX

@OfirPress the hardest thing is knowing something works but people won't touch it unless you can explain it in one sentence

this is the real bottleneck now

linkz@trulinkz

@OfirPress Hi sorry to bother you. If you traded different assets classes such as commodities, crypto, stocks etc. How would you use Fable to help maximize your trading edge. What would you let it build. What tasks would you have it run. I would appreciate your expertise!

tokenbender@tokenbender

@OfirPress @saagnikkk this is that type of advice which is right 95% of the time but that 5% you learn to ignore as inner voice just paints you in the same color everyone else around you is.

habit formations out of paper writing processes don’t sound like a great idea.

chetansawai@c_s_a_w

@OfirPress seen the same thing in production ML. the simple baseline survives every refactor and team handoff while the clever thing quietly gets deleted. point 3 is the real killer, if people can't implement it from the paper it basically doesn't exist

anonymous23154@anonymous255852

@OfirPress Good ideas are always simple.🙂

fresh✨@inffabl

@OfirPress this was a good read

Leyna Music@LeynaMusicx

@OfirPress the number of papers i've seen die because they packed in 5 ideas when one would've changed the field
