
Grep Beats Vector Retrieval In Agent Harnesses For Fact Recovery


There’s an open question on whether grep is all you need for agentic search.

This recent paper from @PwCUS (Sen et al.) seems to suggest that it is. It’s titled “Is Grep All You Need? How Agent Harnesses Reshape Agentic Search”. They test various agentic harnesses (in-house, Claude Code, Codex) and equip the agent with both vector search and grep.

They find that grep generally yields higher accuracy than semantic search.

IMO the main gap of the paper is that it tests retrieval over conversational memory, not over a real-world corpus of enterprise documents. Standard enterprise RAG setups involve asking complex questions over a static document corpus (e.g. 10-Ks, legal contracts, SOPs). The corpus here is per-user chat history, which is quite a different document distribution.

I do think that evolving agentic harnesses simplify the problem of retrieval - hence the popularity of file sandboxes and the view that a vector db is “just a database” - but IMO there’s still more work to be done here.

Paper: https://arxiv.org/pdf/2605.15184

6:21 PM · May 17, 2026 · 4.4K Views

Is Grep All You Need?

The surprising result is not that grep is powerful, but that agent design makes it powerful.

The paper's claim is not that grep beats vectors, but that agents win or fail through their harness.

That sounds like a small distinction until you look at what was actually tested.

The authors compare grep-style search and vector retrieval across LongMemEval tasks, where agents must recover facts from long conversation histories full of distractors. Inline grep beats inline vector across every harness-model pair in their main experiment, sometimes by wide margins.

The tempting headline is that vector databases are overbuilt for coding agents.

The better reading is sharper: when the answer is anchored in literal evidence (names, dates, file paths, function names, error strings, user preferences), grep gives the model a clean mechanical advantage.

Embeddings are built to tolerate paraphrase, but tolerance has a cost. They can pull in semantically nearby clutter, especially when a short agent query is vague.

Grep has the opposite failure mode. It is dumb, cheap, and narrow, but when the agent knows the right string to hunt for, dumb becomes a feature.
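
To make the contrast concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the conversation turns are invented, `grep` and `fuzzy_top_k` are hypothetical helpers, and the bag-of-words cosine score is only a crude stand-in for a similarity-ranked retriever. It is not an embedding model and not the paper's setup.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical conversation history; every turn is invented for illustration.
HISTORY = [
    "user: I moved to Lisbon in March and love the food here.",
    "user: My sister is thinking about moving to Porto next year.",
    "user: Moving cities is stressful, any packing tips?",
    "assistant: Label every box and pack one room at a time.",
]


def grep(pattern: str, lines: list[str]) -> list[str]:
    """Return only the lines containing the literal pattern (case-insensitive)."""
    needle = re.compile(re.escape(pattern), re.IGNORECASE)
    return [line for line in lines if needle.search(line)]


def tokens(text: str) -> Counter:
    """Lowercased word counts; a toy replacement for a learned embedding."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def fuzzy_top_k(query: str, lines: list[str], k: int = 3) -> list[str]:
    """Rank every line by similarity to the query and return the best k.

    Unlike grep, this always fills k slots, whether or not the lines are relevant.
    """
    q = tokens(query)
    return sorted(lines, key=lambda line: cosine(q, tokens(line)), reverse=True)[:k]


if __name__ == "__main__":
    print(grep("Lisbon", HISTORY))
    print(fuzzy_top_k("moving to a new city", HISTORY))
```

On this toy history, the vague query ranks the Porto distractor first, while the literal search for "Lisbon" returns only the one line that holds the answer. A search for a string that never appears returns nothing at all, which is the narrow behavior the post describes.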

The deeper finding is that retrieval is not a component you can benchmark in isolation. The same search method behaves differently depending on whether results are injected inline, written to files, routed through a CLI, or wrapped in a custom agent loop.
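
A rough sketch of two of those delivery modes, assuming a harness that can either paste hits into the prompt or write them to a sandbox file. The function names, the character budget, and the `search_results.txt` filename are hypothetical; the post does not describe how the paper's harnesses implement this.

```python
from pathlib import Path


def deliver_inline(hits: list[str], max_chars: int = 2000) -> str:
    """Paste the hits straight into the prompt, cut off at a character budget."""
    body = "\n".join(hits)
    return "Search results:\n" + body[:max_chars]


def deliver_as_file(hits: list[str], workdir: Path) -> str:
    """Write the hits to a file and give the model only a short pointer to it.

    The model then has to decide to open the file with its own tools.
    """
    path = workdir / "search_results.txt"  # hypothetical filename
    path.write_text("\n".join(hits), encoding="utf-8")
    return f"Search results: {len(hits)} lines written to {path.name}"
```

The search step is identical in both cases. What changes is how much text lands in the context window and whether the model must take an extra action to see the evidence, which is one plausible reason the same retriever can score differently across harnesses.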

So the question is not “Do we still need vector databases?”

The question is whether your agent is solving a semantic discovery problem or an evidence-location problem.

For coding agents, a surprising amount of work is evidence-location: find the symbol, trace the call, inspect the diff, read the failing test, recover the exact line.
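
As an illustration of what "find the symbol" can look like as a primitive tool, here is a small Python sketch of a whole-word search over source files. The `find_symbol` helper and its restriction to `.py` files are assumptions made for the example; a real harness would more likely shell out to grep or a similar CLI.

```python
import re
from pathlib import Path


def find_symbol(root: Path, name: str) -> list[tuple[str, int, str]]:
    """Return (relative path, line number, stripped line) for each whole-word match.

    Only .py files under `root` are scanned; files are visited in sorted order.
    """
    pattern = re.compile(rf"\b{re.escape(name)}\b")
    hits = []
    for path in sorted(root.rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path.relative_to(root).as_posix(), lineno, line.strip()))
    return hits
```

The result is exact locations rather than ranked passages, so the agent can jump directly to a file and line, which is the evidence-location pattern described above.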

Vectors still matter at scale and for fuzzy conceptual search, but this paper weakens the lazy default that every serious agent stack begins with embeddings.

Sometimes the upgrade is not a smarter index.

Sometimes it is giving the model primitive tools, clean files, disciplined context, and a harness that lets exact search do exact work.

----

Paper Link – arxiv.org/abs/2605.15184

Paper Title: "Is Grep All You Need? How Agent Harnesses Reshape Agentic Search"

12:49 PM · May 17, 2026 · 10.1K Views