http://x.com/i/article/2063647807437705216
Sebastian Raschka highlights study showing LLM-generated repository context files fail to improve AI coding agent performance
Developer-written files outperformed LLM-generated ones due to domain expertise.
The quality of the learning outputs varies wildly. There are sentences you can put into AGENTS that will cripple every agent in the repo, others that will supercharge them. It’s wildly powerful; the default mode of having every agent just record whatever it wants is crazy.
The “one crazy trick” that I’ve learned: empirically test your changes by giving a new agent with a fresh context the same task that just failed, and seeing what happens. Then iterate until it actually works.
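A minimal sketch of that loop, assuming you have some way to launch a brand-new agent session per call. `run_fresh_agent` and `first_working_instruction` are hypothetical names, not part of any real harness; the repeat count is also an assumption, added because agent runs are not deterministic.

```python
from typing import Callable, Iterable, Optional

# Hypothetical signature: (instruction_text, task) -> True if the task succeeded.
# The callable is expected to start a brand-new agent session on every call.
AgentRunner = Callable[[str, str], bool]


def first_working_instruction(
    task: str,
    candidates: Iterable[str],
    run_fresh_agent: AgentRunner,
    trials: int = 3,
) -> Optional[str]:
    """Return the first candidate under which every fresh-context trial passes."""
    for instruction in candidates:
        # Repeat the trial because a single pass may be luck.
        if all(run_fresh_agent(instruction, task) for _ in range(trials)):
            return instruction
    return None
```

The point of passing the runner in as a callable is that the agent that wrote the instruction never gets to grade it; only a session with no memory of the failure does.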
My next approach will be to try to fork the primary process off a checkpoint before it starts trying to learn, but honestly I’m not optimistic. Quality in-context learning is hard.
You can try to tell your agent to do that, but it usually doesn’t work because then they hardcode the answer for this particular case to get the test case to pass.

@rasbt huh even efficiency is higher without instructions. intuitively that doesn’t make a ton of sense, would definitely be interested in how this performs on newer models.

@rasbt AGENTS.md is literally a good helper tool for coding agents
saved this article for myself

@rasbt This matches my experience perfectly. On massive projects with multiple local directories, I keep a root agent markdown file alongside a doc folder with separate .md files for individual features. It handles cross-directory changes beautifully. (1/3)

@valeriibo and generic instructions are just inefficient

@rasbt stale agent instructions are just another source of bugs

@rasbt I often find that an old context file I wrote a few weeks ago is hurting the task I plan to do right now.
Context becomes obsolete so quickly. I either need the discipline to update the files with every code change, or I keep them only for the duration of my task.
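One crude way to notice that kind of drift is to compare modification times: if source files have changed since the context file was last edited, the context file deserves a second look. This is only a sketch; `files_newer_than_context` is a hypothetical helper, and file mtimes are a rough proxy (version-control history would be more reliable).

```python
from pathlib import Path
from typing import Iterable, List


def files_newer_than_context(
    context_file: Path, source_files: Iterable[Path]
) -> List[Path]:
    """Return source files modified after the context file was last edited."""
    context_mtime = context_file.stat().st_mtime
    # Any file touched after the context file may have invalidated its notes.
    return [p for p in source_files if p.stat().st_mtime > context_mtime]
```

A non-empty result does not prove the context is wrong, only that nobody has revisited it since the code moved.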

Really interesting! In practice, you can't really do without context files.
Just one example: in our Littlebird codebase, the agent constantly tried to run tests without exporting the necessary env variables, ran into a db access issue, then did all kinds of ridiculous things to get around it.
Once we documented proper cmds, the silliness stopped.
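A complementary guard, sketched here under assumptions: a preflight check that names the missing variables before the tests run, so the failure points at the documented fix instead of a database error. The function names are hypothetical, and so are any variable names you pass in; the post does not say which variables Littlebird needs.

```python
import os
from typing import List, Mapping, Optional, Sequence


def missing_env_vars(
    required: Sequence[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def preflight_message(missing: Sequence[str]) -> str:
    """Build a message that says what to do, not just what failed."""
    if not missing:
        return ""
    return (
        "Missing env vars: " + ", ".join(missing)
        + ". Export them before running tests."
    )
```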

@rasbt The root agent definitely eats up more tokens upfront. But once it absorbs that hierarchy, it works completely hands-off and executes complex, cross-cutting changes without me having to feed it context manually. (2/3)

@ReasonMeThis @rasbt Are your context files individually maintained or shared across the team?
You can fix that by creating a multi-step process that forces reflection on test hacking. That works sometimes, but often they just gaslight themselves. If you put the check in a fresh context again it won’t cheat, but the checker gets confused because it lacks context.

@shuying_luo @rasbt What constitutes a context file for you? If they are souls, heartbeat, memory, etc., plus skill sets, is there more to it? If it is just these, then why not create a multi-agent profile where each profile is an agent with a soul, memory, etc.?

@rasbt Interesting read! The result isn't intuitive at all. Being lazy is nice sometimes

@rasbt i think the agents.md file should be only meant to tell the agent harness, the technique or the procedure it needs to go through to implement the new stuff or whatever we do with that repo. basically like a SOP.
so the repo stays consistent and not a mess of slop.

@rasbt I always kept very minimal files with up to 50 lines of text. Just general code preferences, project context, and a few procedures that help me to reason through generated code. Anything more required too much maintenance and degraded performance.
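A size budget like that is easy to enforce mechanically. The sketch below counts non-blank lines and compares against a limit; the helper names are hypothetical, and counting only non-blank lines is an assumption about what "50 lines of text" means.

```python
def context_line_count(text: str) -> int:
    """Count non-blank lines in a context file's contents."""
    return sum(1 for line in text.splitlines() if line.strip())


def within_budget(text: str, max_lines: int = 50) -> bool:
    """True if the context file stays at or under the line budget."""
    return context_line_count(text) <= max_lines
```

Run as a pre-commit or CI check, this turns "keep it minimal" from a habit into a gate.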

@monsieur_pickle and newer harnesses / harness versions

@anieasyy @rasbt Mostly shared, but occasionally I add something to my personal global context