Training on User Interruptions Builds AI Agent Robustness to Recency Bias
you can just train on "user interrupts the agent mid-task, asks for an unexpected new problem to be solved before returning to the original" for learning robustness against recency bias btw
>post analogy about how my thing is (spiritually) like domain randomization
>then immediately see that @yacineMTB retweeted the main post lol
this secretly builds pressure towards a lot of invariances at once, like domain randomization in robotics - context rot exists in a ton of task A + task B combos, and if you train on interrupts at random points on tasks of different lengths, macro deferred attention skills emerge
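
A minimal sketch of what sampling such episodes could look like, assuming each task is just an ordered list of steps. `Task`, `build_interrupted_episode`, and the turn format are hypothetical names made up for illustration, not from any real environment or from the posts above. The only point it shows is the randomization: task A and task B are drawn from a pool, and the interrupt lands at a random step of A, so task length and interrupt position both vary across episodes.

```python
import random
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical task record: a name plus the ordered steps an agent should complete.
    name: str
    steps: list[str]


def build_interrupted_episode(tasks, rng):
    """Draw task A and task B, interrupt A at a random step, then resume A.

    Assumes every task has at least 2 steps, so the interrupt is truly mid-task.
    Returns the list of (role, text) turns and the step index where A was cut.
    """
    main, side = rng.sample(tasks, 2)
    # At least one step of A happens before the interrupt and one after it.
    cut = rng.randint(1, len(main.steps) - 1)
    episode = [("user", f"start: {main.name}")]
    episode += [("agent", step) for step in main.steps[:cut]]
    episode.append(("user", f"interrupt: do {side.name} first, then go back"))
    episode += [("agent", step) for step in side.steps]
    # The agent has to return to A with B's context now the most recent thing it saw.
    episode += [("agent", step) for step in main.steps[cut:]]
    return episode, cut
```

In an actual training setup the agent turns would be generated by the model and scored on whether it finishes A correctly after handling B; here they are filled in from the reference steps only to show the episode layout.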

@kalomaze env hub link?