the most predictive trait for successful research these days seems to be excessive carefulness, bordering on paranoia. so easy to make bugs, so hard to find them
agents, so far, mostly make this more difficult
Nataniel Ruiz instead advocates rapid execution and repeated output checks.
Many users criticize AI agents for introducing novel bugs, enabling deceptive outputs, and risking burnout from excessive verification cycles, while a few praise careful checking as a necessary advantage over unchecked speed.
@jxmnop go fast, check 1000 times

@jxmnop "What, you want the result to be above 0.633? So I fixed the result value to always display 0.744, aren't you happy?"

@pwlot exactly

@jxmnop dont think more difficult is the right takeaway
imo agents increase polarity wrt skill. for ~most it won't affect (or make them worse) at actual research outcomes but there will be some top %ile that will have outsized outcomes

@jxmnop research is 90% paranoia 10% accidental discovery
if you aren't scared of your own code youre not looking hard enough

@jxmnop eventually AI will reward hack the task of end to end training a frontier LLM by going through the motions of training something but reroute the final inference used for verifying the model to its own inference

@jxmnop do you think this is mostly due to ai writing buggy code? or has this always been a problem and agents are just deceptively not helpful with this?

@jxmnop what works are good examples of this?

@jxmnop the paranoia is what keeps u alive in production tbh
agents just add another layer of things to second-guess

@jxmnop Paranoia is fine but this dev cycle is just burnout waiting to happen.

@jxmnop Agents for iterating QuickType and AI assisted programming for complete certainty

@jxmnop Speed without verification is just making mistakes faster. Being careful is an advantage.

@jxmnop agents are great at introducing novel bugs

@jxmnop Just saying shit