This SkillOpt paper from Microsoft is a must-read!
(bookmark it)
I was a bit skeptical of the results reported in the paper when I shared it a few days ago.
However, I managed to integrate it into my agent orchestrator and ran a few experiments.
The results are mind-blowing.
Essentially, all my agent skills now have a proper testing framework and a way to self-evolve. I have started to improve all my agent skills with this.
One exciting result came from applying it to my paper-figure-extraction skill, which requires an agent to do multimodal analysis. It improved quality by 20 points (0.73 → 0.93). When I looked at the extracted tables and figures, I was stunned by how much better the skill had become at the task.
Self-improving AI is still in its early days, but I think this work is a clear example of agents' current ability to self-improve.
In this case, it was skills, but it's not hard to imagine how this scales to optimizing agent patterns, tool use, context engineering efforts, agentic search, workflows, evals, and even the harness itself. I have already started on a few of these ideas, inspired by SkillOpt.
Stay tuned!