HTML Artifacts Improve AI Agent Verification in Long Workflows
Increasingly, HTML Artifacts are becoming a core part of how I work with AI agents.
Long-horizon agent sessions need a better way to surface insights about what work it has done.
This may not be obvious right now, but as you start to let your agent work on dynamic workflows, large codebases, long-running loops (e.g., using /goal), and deep research tasks, you need a good way to present results. Chat window is not it.
You also don't want to just trust everything the agents do. Artifacts help provide an important verification layer, which in turn enables important decision-making.
I like HTML artifacts because I can just ask the agent to produce as many of them (and in whatever form) as I need to verify the work and make sense out of everything. I even built a nice tab system for my artifacts. They are great for continual learning and research.
I use HTML artifacts for logging, tracking experiments, brainstorming, managing my inbox, code reviews, agent session management, deep research, writing, reading, and so much more.
I believe @karpathy wrote about this somewhere: As we move on to more advanced applications of AI agents and outputs get more complex, we will start to find the need for even more advanced forms of interactions with AI, including interactive neural videos/simulations.
I did a talk on LLM Wikis and HTML artifacts recently, if you are curious to learn more on the topic: https://academy.dair.ai/events/cmovobp97000904l5h0n9a2yz Doing a second session and a few releases on our platform around artifacts.
Increasingly, HTML Artifacts are becoming a core part of how I work with AI agents. Long-horizon agent sessions need a better way to surface insights about what work it has done. This may not be obvious right now, but as you start to let your agent work on dynamic workflows, large codebases, long-running loops (e.g., using /goal), and deep research tasks, you need a good way to present results. Chat window is not it. You also don't want to just trust everything the agents do. Artifacts help provide an important verification layer, which in turn enables important decision-making. I like HTML artifacts because I can just ask the agent to produce as many of them (and in whatever form) as I need to verify the work and make sense out of everything. I even built a nice tab system for my artifacts. They are great for continual learning and research. I use HTML artifacts for logging, tracking experiments, brainstorming, managing my inbox, code reviews, agent session management, deep research, writing, reading, and so much more. I believe @karpathy wrote about this somewhere: As we move on to more advanced applications of AI agents and outputs get more complex, we will start to find the need for even more advanced forms of interactions with AI, including interactive neural videos/simulations.