Research Finds Mid-Tier Models Benefit Most from Evolved AI Agent Skills

Original post

elvis@omarsar0

This is something I have been thinking about after that @karpathy post on LLM Knowledge Bases. Fine-tuning models for maintaining better agent skills, memory, context engineering, routing efficiency, and knowledge bases is going to be huge.

You might also find this read interesting:

elvis@omarsar0

Very good advice on self-improving agents.

(bookmark it)

This is something I am seeing in my own experiments with coding agents and harnesses for long-horizon tasks.

What I have found is that stronger models do not always evolve better agents.

The current belief in self-evolving agents is that a bigger model writes better prompt and skill edits, so devs put their best model in the evolver seat.

New research shows that intuition is mostly wrong.

The work separates two abilities that usually get conflated. Producing harness updates stays flat across model capability, so Qwen3.5-9B writes edits roughly as good as Claude Opus 4.6. Benefiting from those updates follows an inverted-U that peaks at mid-tier models, while weak models fail to even activate the edits and strong models have little headroom left.

This is important to understand as it tells you where to spend. Put a cheap model on the evolver and your expensive model on the solver, because the gains land solver-side, not evolver-side.
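A minimal sketch of that evolver/solver split, for illustration only. Everything here is an assumption rather than the paper's method: the `Harness` structure, the `evolve` loop, and the idea that a "model" is just a callable from prompt text to reply text are all hypothetical names. The evolver proposes one skill note per round, and the edit is kept only if the solver's average score improves.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical type: a "model" is any callable mapping a prompt to a reply.
Model = Callable[[str], str]


@dataclass
class Harness:
    """Prompt plus skill notes the solver runs with (illustrative structure)."""
    system_prompt: str
    skills: List[str] = field(default_factory=list)

    def render(self) -> str:
        # Concatenate the system prompt and every accepted skill note.
        return "\n".join([self.system_prompt, *self.skills])


def evolve(
    harness: Harness,
    tasks: List[str],
    solver: Model,
    evolver: Model,
    score: Callable[[str, str], float],
    rounds: int = 3,
) -> Tuple[Harness, float]:
    """Let `evolver` propose skill edits; keep one only if the solver improves."""

    def evaluate(h: Harness) -> float:
        # Average solver score over all tasks under harness `h`.
        total = sum(score(t, solver(h.render() + "\n" + t)) for t in tasks)
        return total / len(tasks)

    best = evaluate(harness)
    for _ in range(rounds):
        # The evolver seat: any model can sit here, including a cheap one.
        proposal = evolver("Suggest one skill note for this harness:\n" + harness.render())
        candidate = Harness(harness.system_prompt, harness.skills + [proposal])
        result = evaluate(candidate)
        if result > best:  # gains are measured solver-side
            harness, best = candidate, result
    return harness, best
```

In this setup the two seats are independent arguments, so swapping the evolver for a cheaper model does not touch the solver. Whether that swap costs anything in practice is the question the paper studies; the sketch only shows where each model plugs in.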

Paper: https://arxiv.org/abs/2605.30621

Learn to build effective AI agents in our academy: https://academy.dair.ai/

8:28 AM · Jun 1, 2026 · 3K Views

Sentiment

Many users praised the research showing mid-tier models optimize self-improving AI agents best, highlighting its practical takeaways for cost efficiency and agent system design.

Pos 82.4%

Neg 17.6%

17 comments with sentiment.
