
Benchmarks get reward-hacked constantly. KernelBench alone has a long list of exploits: monkey-patching timing functions, caching reference outputs, etc. Fixing these is painful, and new ones keep surfacing. But if agents can find exploits, can they automatically patch them too? (1/8)

