/AI · 22h ago

Sebastian Raschka highlights study showing LLM-generated repository context files fail to improve AI coding agent performance

Developer-written files outperformed LLM-generated ones due to domain expertise.

#154

Original post

Sebastian Raschka@rasbt#154inAI

http://x.com/i/article/2063647807437705216

8:47 AM · Jun 7, 2026 · 102.5K Views

Cluster Engagement

Posts from X

Most Activity

Views 9.4K · Bookmarks 71 · Likes 71 · Retweets 3 · Replies 5

Emmett Shear@eshear

The quality of the learning outputs varies wildly. There are sentences you can put into AGENTS that will cripple every agent in the repo, and others that will supercharge them. It’s wildly powerful; the default mode of having every agent just record whatever it wants is crazy.

Sebastian Raschka@rasbt

http://x.com/i/article/2063647807437705216

15h9.4K7171

Emmett Shear@eshear

The “one crazy trick” that I’ve learned: empirically test your changes by giving a new agent with a fresh context the same task that just failed and seeing what happens, then iterating until it actually works.

Emmett Shear@eshear

15h1.3K254
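
The loop described in the post above could be sketched roughly as follows. This is a minimal illustration, not anyone's actual tooling: `run_agent` is a hypothetical callable that starts a brand-new agent session with the given context-file text and reports whether the previously failing task now passes, and `trials_per_candidate` is an assumed knob for repeating the check, since a single run may pass by luck.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

# Hypothetical runner: (context_file_text, task) -> True if the task succeeded.
# A real implementation would launch a new agent session with no prior history.
AgentRunner = Callable[[str, str], bool]


@dataclass
class TrialResult:
    instructions: str  # the context-file text that passed every trial
    attempts: int      # total number of fresh-context runs performed


def iterate_until_pass(
    task: str,
    candidates: Iterable[str],
    run_agent: AgentRunner,
    trials_per_candidate: int = 3,
) -> Optional[TrialResult]:
    """Try each candidate context file against the task that failed earlier.

    Every trial goes through run_agent, which is assumed to use a fresh
    context, so the result reflects the file rather than leftover history.
    """
    attempts = 0
    for text in candidates:
        passed_all = True
        for _ in range(trials_per_candidate):
            attempts += 1
            if not run_agent(text, task):
                passed_all = False
                break  # this wording does not fix the failure; try the next one
        if passed_all:
            return TrialResult(instructions=text, attempts=attempts)
    return None  # no candidate fixed the task
```

Requiring several passes per candidate is a design choice made here for illustration; the post itself only says to rerun the failed task in a fresh context and iterate.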

Emmett Shear@eshear

My next approach will be to try to fork the primary process off a checkpoint before it starts trying to learn, but honestly I’m not optimistic. Quality in-context learning is hard.

15h1.2K71

Emmett Shear@eshear

You can try to tell your agent to do that, but it usually doesn’t work because then they hardcode the answer for this particular case to get the test case to pass.

Emmett Shear@eshear

15h2K120

Andrei Bocan 🐐@monsieur_pickle

@rasbt huh even efficiency is higher without instructions. intuitively that doesn’t make a ton of sense, would definitely be interested in how this performs on newer models.

22h52812

Alpha Batcher@alphabatcher

@rasbt AGENTS.md is literally a good helper tool for coding agents

saved this article for myself

20h1591

Saksham Mishra@SakshamMgc02

@rasbt This matches my experience perfectly. On massive projects with multiple local directories, I keep a root agent markdown file alongside a doc folder with separate .md files for individual features. It handles cross-directory changes beautifully. (1/3)

21h1422

Sebastian Raschka@rasbt

@valeriibo and generic instructions are just inefficient

22h3781

@valerii_arch@valeriibo

@rasbt stale agent instructions are just another source of bugs

22h417

Shuying Luo@shuying_luo

@rasbt I often find that an old context file I wrote a few weeks ago is hurting the task I plan to do right now.

Context becomes obsolete so quickly. I either need the discipline to update the files with every code change, or I keep them only for the duration of my task.

20h128
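
One way to notice the staleness described in the post above is to compare modification times. The sketch below is only an assumption about how such a check might look: the function name `stale_sources`, the `*.py` default pattern, and the idea of using file mtimes as a staleness signal are all illustrative choices, not something the thread prescribes.

```python
from pathlib import Path
from typing import List


def stale_sources(context_file: Path, source_root: Path, pattern: str = "*.py") -> List[Path]:
    """Return source files modified more recently than the context file.

    A non-empty result suggests the context file may describe code that
    has changed since it was written. Modification time is a rough proxy:
    it cannot tell whether a change actually invalidated the notes.
    """
    context_mtime = context_file.stat().st_mtime
    newer = [
        path
        for path in source_root.rglob(pattern)
        if path.is_file() and path.stat().st_mtime > context_mtime
    ]
    return sorted(newer)
```

A check like this could run before a task starts, so that an outdated file is reviewed or set aside rather than silently steering the agent.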

Dmitriy Vasilyuk@ReasonMeThis

Really interesting! In practice, you can't really do without context files.

Just one example: in our Littlebird codebase, the agent constantly tried to run tests without exporting the necessary env variables, ran into a db access issue, then did all kinds of ridiculous things to get around it.

Once we documented the proper commands, the silliness stopped.

18h72
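
The fix described in the post above, documenting the exact test command together with its environment, could also be captured in a small wrapper that the context file points to. The sketch below is hypothetical: the variable name `DATABASE_URL`, its value, and the helper names are placeholders, since the post does not say which variables or commands the Littlebird codebase uses.

```python
import os
import subprocess
from typing import Dict, List, Optional

# Placeholder defaults; a real project would list its own required variables.
REQUIRED_ENV: Dict[str, str] = {
    "DATABASE_URL": "postgres://localhost/test_db",  # hypothetical value
}


def build_test_env(base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy the base environment and fill in any missing required variables."""
    env = dict(os.environ if base is None else base)
    for name, default in REQUIRED_ENV.items():
        env.setdefault(name, default)  # never override a value already set
    return env


def run_tests(command: List[str], base: Optional[Dict[str, str]] = None) -> int:
    """Run the test command with the required variables exported.

    Returns the exit code of the command instead of raising, so a caller
    (human or agent) can see the failure directly.
    """
    completed = subprocess.run(command, env=build_test_env(base), check=False)
    return completed.returncode
```

With a wrapper like this, the context file only needs one line telling the agent which command to run, instead of a list of exports the agent might skip.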

Saksham Mishra@SakshamMgc02

@rasbt The root agent definitely eats up more tokens upfront. But once it absorbs that hierarchy, it works completely hands-off and executes complex, cross-cutting changes without me having to feed it context manually. (2/3)

21h28

Anirudh@anieasyy

@ReasonMeThis @rasbt Are your context files individually maintained or shared across the team?

18h21

Emmett Shear@eshear

You can fix that by creating a multi-step process that forces reflection on test hacking. That works sometimes, but often they just gaslight themselves. If you put the check in a fresh context again, it won’t cheat, but the checker gets confused because it lacks context.

15h18020

Daniel.md🛀@dbdanieljnr

@shuying_luo @rasbt What constitutes a context file for you? If they are souls, heartbeat, memory, etc. and skill sets, is there more to it? If they are just these, then why not create a multi-agent profile where each profile is an agent with a soul, memory, etc.?

18h8

Guillaume Philippe@guphilippee

@rasbt Interesting read! The result isn't intuitive at all. Being lazy is nice sometimes.

22h2291

Aaryan Kakad@aaryan_kakad

@rasbt I think the agents.md file should only tell the agent harness the technique or procedure it needs to follow to implement new work in that repo. Basically, like an SOP.

So the repo stays consistent and not a mess of slop.

22h1691

Gene Sobolev@genesobolev

@rasbt I always kept very minimal files with up to 50 lines of text. Just general code preferences, project context, and a few procedures that help me to reason through generated code. Anything more required too much maintenance and degraded performance.

21h1361

Sebastian Raschka@rasbt

@monsieur_pickle and newer harnesses / harness versions

22h362

Dmitriy Vasilyuk@ReasonMeThis

@anieasyy @rasbt Mostly shared, but occasionally I add something to my personal global context

17h112