
Microsoft SkillOpt Paper Enables Self-Evolving AI Agent Skills

elvis (@omarsar0)

This SkillOpt paper from Microsoft is a must-read!

(bookmark it)

I was a bit skeptical of the results reported in the paper when I shared it a few days ago.

However, I managed to integrate it into my agent orchestrator and ran a few experiments.

The results are mind-blowing.

Essentially, all my agent skills now have a proper testing framework and a way to self-evolve. I have started to improve all my agent skills with this.
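To make "a testing framework and a way to self-evolve" concrete, here is a minimal sketch of an eval-gated revision loop. It is not SkillOpt's algorithm and not the author's orchestrator; it is a generic "score, propose an edit, keep it only if the score improves" pattern. Every name in it (`EvalCase`, `score`, `evolve`, `toy_run`, `toy_propose`) is hypothetical, and the two toy functions stand in for what would be real agent calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    expected: str


# Assumption: a "skill" is plain instruction text, and running it is a
# function (skill_text, prompt) -> output. A real setup would call an agent.
RunSkill = Callable[[str, str], str]
# Assumption: an editor proposes new skill text given the failing cases.
ProposeEdit = Callable[[str, List[EvalCase]], str]


def score(skill: str, cases: List[EvalCase],
          run_skill: RunSkill) -> Tuple[float, List[EvalCase]]:
    """Return the pass rate in [0, 1] and the cases that failed."""
    failures = [c for c in cases if run_skill(skill, c.prompt) != c.expected]
    return 1 - len(failures) / len(cases), failures


def evolve(skill: str, cases: List[EvalCase], run_skill: RunSkill,
           propose_edit: ProposeEdit, rounds: int = 5) -> Tuple[str, float]:
    """Keep a proposed revision only when it strictly improves the score."""
    best = skill
    best_score, failures = score(best, cases, run_skill)
    for _ in range(rounds):
        if not failures:
            break  # nothing left to fix on this eval set
        candidate = propose_edit(best, failures)
        cand_score, cand_failures = score(candidate, cases, run_skill)
        if cand_score > best_score:
            best, best_score, failures = candidate, cand_score, cand_failures
    return best, best_score


# Toy stand-ins so the sketch runs without any model or network access.
def toy_run(skill: str, prompt: str) -> str:
    return prompt.upper() if "uppercase" in skill else prompt


def toy_propose(skill: str, failures: List[EvalCase]) -> str:
    return skill + " Always answer in uppercase."


cases = [EvalCase("abc", "ABC"), EvalCase("hi", "HI")]
improved_skill, final_score = evolve("Echo the input.", cases,
                                     toy_run, toy_propose)
print(improved_skill, final_score)
```

The gate on `cand_score > best_score` is the design choice that matters: a revision that does not beat the current skill on the eval set is discarded, so the quality of the eval cases bounds the quality of the result.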

One exciting result was when I applied it to my paper-figure-extraction skill, which requires an agent to do multimodal analysis. It improved quality by 20 points (0.73 → 0.93). When I looked at the extracted tables and figures, I was stunned by how much better the skill got at the task.
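As a quick check on the reported numbers, the snippet below works out the gain both ways. It assumes the quality score is on a 0 to 1 scale, which the post implies but does not state, so "20 points" is read here as percentage points.

```python
before, after = 0.73, 0.93  # scores quoted in the post

# Absolute gain in percentage points, assuming a 0-1 quality scale.
absolute_points = round((after - before) * 100, 1)
# Relative gain over the starting score, in percent.
relative_pct = round((after - before) / before * 100, 1)

print(absolute_points, relative_pct)
```

Under that assumption the absolute gain is 20 points, which corresponds to a relative improvement of roughly 27% over the starting score.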

Self-improving AI is still in its early days, but I think this work is a clear example of agents' current ability to self-improve.

In this case, it was skills, but it's not hard to imagine how this scales to optimizing agent patterns, tool use, context engineering efforts, agentic search, workflows, evals, and even the harness itself. I have already started on a few of these ideas, inspired by SkillOpt.

Stay tuned!

9:07 AM · Jun 3, 2026 · 3.7K Views
Replies (2)
elvis (@omarsar0)

I love this figure from the paper. It does a really good job explaining how it all works.

elvis (@omarsar0)

In case you were wondering, I have already started to package this into something that's more accessible to others. It still requires thinking about the eval side of things. Learn to do evals: start by having agents help you with them, then automate the process.

I am actually trying to figure out a way to automate this whole thing so I can run an experiment on how something like this could work autonomously on a schedule.
