> it may turn out that the only way to confidently evaluate misalignment in an AI agent at a 1-year horizon is to actually run the agent for a year
this is a bit confusing imo: AI agent time is quite different from human time, so a 1-year-horizon task is quite different from running the agent for 1 year of wall-clock time, no?
you can probably find a hardware/parallelism config that optimizes speed for very long evals, or even trade off sequential test-time compute against parallel test-time compute? (but then it's a bit different, i agree)
also, output tokens aren't a perfect measure for things like autoresearch: a big portion of the time is actually spent in "tool calls", which here are training runs
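to make the point above concrete, here's a rough back-of-envelope sketch (not anyone's actual methodology). the function name, the parameters, and every number in it are hypothetical. it assumes wall-clock time is token generation time plus tool-call time, and that some fraction of that work can be spread across parallel workers, Amdahl-style:

```python
def wall_clock_hours(output_tokens, tokens_per_sec, tool_hours,
                     parallel_fraction=0.0, workers=1):
    """Rough wall-clock estimate (in hours) for one long-horizon agent eval.

    All inputs are illustrative assumptions, not measured values.
    """
    # Time spent generating output tokens, converted from seconds to hours.
    gen_hours = output_tokens / tokens_per_sec / 3600
    # Time if everything ran strictly sequentially: generation plus tool calls
    # (e.g. training runs launched by the agent).
    serial_hours = gen_hours + tool_hours
    # Amdahl-style scaling: only `parallel_fraction` of the work speeds up
    # with more workers; the rest stays sequential.
    scale = (1 - parallel_fraction) + parallel_fraction / workers
    return serial_hours * scale


# Hypothetical eval: 50M output tokens at 100 tok/s, plus 2000 hours of tool calls.
sequential = wall_clock_hours(50_000_000, 100, 2000)
parallel = wall_clock_hours(50_000_000, 100, 2000, parallel_fraction=0.8, workers=8)
print(f"sequential: {sequential:.0f} h, with 8 workers: {parallel:.0f} h")
```

with these made-up numbers the tool calls dominate the token generation time, which is why output tokens alone are a poor proxy, and the parallel config cuts wall-clock time well below what the sequential estimate suggests.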
http://x.com/i/article/2057694226981257216


