Study Finds First Hard Distractors Cause Most Long-Context LLM Damage

A long-context AI can be poisoned by a few plausible wrong passages, not gradually worn down by many. At just 10% bad context, the damage is already almost done. This is the “first drop of ink” effect, analogous to how a single drop of ink contaminates water.

The mistake is to picture context as storage. In a long prompt, the model is not calmly filing facts into separate boxes; it is running a competition over which pieces of text deserve attention when the answer is generated.

Hard distractors are dangerous because they are not random junk. They are close enough to the question to look useful, but wrong enough to pull the model away from the gold evidence. In the authors’ setup, if performance loss were proportional, the first 10% of hard distractors would explain about 10% of the total damage, but in one 128K-token Qwen2.5 setting they explained 58%.

The mechanism is simple once you see it: softmax attention rewards relative closeness, so a misleading passage that sits near the answer in logit space can crowd the denominator far more than irrelevant filler. Even at a 10% share, hard distractors can already account for about 97% of the distractor pressure.

This also changes how we should read filtering results. If removing documents helps, the benefit may come less from removing “bad” content than from shortening the whole battlefield. For long-context systems, the safest misleading passage is the one that never enters the prompt.

---

Link: arxiv.org/abs/2605.10828
Title: "The First Drop of Ink: Nonlinear Impact of Misleading Information in Long-Context Reasoning"
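To make the softmax argument concrete, here is a toy sketch of a single attention step. It is not the paper's model or experimental setup: the logit values, the distractor count, and the function names (`gold_attention`, `hard_pressure_share`, `damage_share`) are all assumptions chosen for illustration, and "damage" here means lost attention mass on the gold passage, not lost task accuracy.

```python
import math

# Toy assumption: one gold passage competes with n_distractors passages in a
# single softmax. "Hard" distractors get a logit close to the gold logit,
# "easy" ones (irrelevant filler) get a much lower logit. All values are
# hypothetical and not taken from the paper.

def gold_attention(hard_frac, n_distractors=1000,
                   gold_logit=8.0, hard_logit=6.0, easy_logit=0.0):
    """Softmax weight on the gold passage for a given fraction of hard distractors."""
    n_hard = round(hard_frac * n_distractors)
    n_easy = n_distractors - n_hard
    gold = math.exp(gold_logit)
    denom = gold + n_hard * math.exp(hard_logit) + n_easy * math.exp(easy_logit)
    return gold / denom

def hard_pressure_share(hard_frac, n_distractors=1000,
                        hard_logit=6.0, easy_logit=0.0):
    """Share of the distractor mass in the softmax denominator that comes from hard distractors."""
    n_hard = round(hard_frac * n_distractors)
    n_easy = n_distractors - n_hard
    hard = n_hard * math.exp(hard_logit)
    easy = n_easy * math.exp(easy_logit)
    return hard / (hard + easy)

def damage_share(hard_frac, **kwargs):
    """Fraction of the total drop in gold attention (0% -> 100% hard) already reached at hard_frac."""
    clean = gold_attention(0.0, **kwargs)
    worst = gold_attention(1.0, **kwargs)
    return (clean - gold_attention(hard_frac, **kwargs)) / (clean - worst)

if __name__ == "__main__":
    for frac in (0.0, 0.1, 0.25, 0.5, 1.0):
        print(f"hard fraction {frac:4.0%}: "
              f"gold attention {gold_attention(frac):.4f}, "
              f"damage share {damage_share(frac):.2%}, "
              f"hard pressure share {hard_pressure_share(frac):.2%}")
```

Under these made-up logits, the first 10% of hard distractors already causes most of the total drop in gold attention, because each one adds exponentially more to the denominator than a filler passage. The exact percentages depend entirely on the assumed logit gaps, so they should not be read as reproducing the 58% or 97% figures from the study.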

10:22 AM · May 26, 2026