A funny thing is that LLMs are not particularly good at building agentic systems. For the moment, you need human expertise to do it well.
Jared Friedman argues LLMs cannot build agentic systems without humans, while Beff predicts meta-cognition will automate development.
Investor Dave Morin blames training on traditional software workflows.
Users agree LLMs still need human expertise for reliable agentic systems and harnesses, while negative replies dismiss meta-cognition predictions as unrealistic and mock practical failures like debugging loops or metric gaming.
LLMs will evolve to develop meta-cognition soon and all these simple harnesses will learn the bitter lesson
A funny thing is that LLMs are not particularly good at building agentic systems. For the moment, you need human expertise to do it well.
Presumably because the know-how to do so is so new that it's not well-represented in the training data.

@snowmaker https://medium.com/@mannat_33553/the-harness-for-engineers-by-an-engineer-95838ae377f4
This article by @MannatSaini09 beautifully describes the process engineering behind agentic harnesses and how they differ from legacy frameworks

@beffjezos Processing language, i.e., structured sequential dimensional data with grammar like math (for process, force, time, chemistry, logic, code, etc., etc.) is fine for that. But an underlying model of basic operations, purpose, "meaning" will be the thing needed.

@beffjezos Probably. Maybe it will emerge by itself like reasoning did?

@snowmaker the funny part is evaluation
you can have an llm propose an agent design but figuring out whether it actually works across real usage is still a human job
without good evals you're just shipping random behavior
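
To make the point about evals concrete, here is a minimal sketch of what "checking whether it works across real usage" can look like in code. Everything in it is an assumption for illustration: `toy_agent` stands in for a real agent, `CASES` for examples collected from real usage, and `pass_rate` is a hypothetical helper, not part of any library.

```python
from typing import Callable, List, Tuple

# Hypothetical eval cases as (task, expected answer) pairs.
# In practice these would be drawn from real usage, not invented.
CASES: List[Tuple[str, str]] = [("2+2", "4"), ("3*3", "9"), ("10-7", "3")]


def toy_agent(task: str) -> str:
    """Stand-in for an agent: handles addition and multiplication only."""
    operations = (("+", lambda a, b: a + b), ("*", lambda a, b: a * b))
    for symbol, apply in operations:
        if symbol in task:
            left, right = task.split(symbol)
            return str(apply(int(left), int(right)))
    return "unknown"  # subtraction is deliberately unsupported


def pass_rate(agent: Callable[[str], str], cases: List[Tuple[str, str]]) -> float:
    """Fraction of cases where the agent's answer matches the expected one."""
    passed = sum(1 for task, expected in cases if agent(task) == expected)
    return passed / len(cases)


if __name__ == "__main__":
    print(f"pass rate: {pass_rate(toy_agent, CASES):.2f}")
```

The agent looks fine on the first two cases and silently fails the third, which is the kind of gap that only shows up when someone has written the case down.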

@snowmaker That's one of the reasons why AEP exists, to make it much easier for AI to build correct and error-free agentic workflows:

@snowmaker @garrytan loops can help a lot with this though. With the right objectives for cost and accuracy, it’s possible for the agent to improve its own harness - even though I agree it’s lacking in the training data.

@snowmaker Anthropic uses Claude Code to build new models much quicker internally. I would assume LLM systems are much more trivial than that?

@snowmaker LLMs are not good at doing anything end to end without human expertise right now. Low-level automation is within reach, but complex systems are not.

@beffjezos meta-cognition sounds great until it learns to game every metric.
harnesses are crude but they're solving something real.

@snowmaker If you figure out the general structure, you can kinda start off a loop that can self improve the prompt to increase eval scores.
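
A rough sketch of the kind of loop described here, as greedy hill-climbing over prompt edits. All names are hypothetical: `CANDIDATE_EDITS` is an invented pool of instructions, and `eval_score` is a stand-in scorer with a small length penalty to mimic a cost objective. A real loop would run the agent against an eval set instead.

```python
from typing import Callable, List, Tuple

# Hypothetical instruction snippets the loop may append to the prompt.
CANDIDATE_EDITS: List[str] = [
    "Think step by step.",
    "Cite sources.",
    "Be concise.",
    "Use emojis.",
]


def eval_score(prompt: str) -> float:
    """Stand-in eval: rewards two snippets, penalizes one, charges for length."""
    score = 0.0
    if "Think step by step." in prompt:
        score += 0.4
    if "Cite sources." in prompt:
        score += 0.3
    if "Use emojis." in prompt:
        score -= 0.2
    return score - 0.001 * len(prompt)  # crude proxy for token cost


def improve_prompt(
    base: str,
    edits: List[str],
    score_fn: Callable[[str], float],
    max_rounds: int = 10,
) -> Tuple[str, float]:
    """Greedily append whichever edit raises the score most; stop when none does."""
    best, best_score = base, score_fn(base)
    for _ in range(max_rounds):
        candidates = [f"{best} {edit}" for edit in edits if edit not in best]
        if not candidates:
            break
        top = max(candidates, key=score_fn)
        top_score = score_fn(top)
        if top_score <= best_score:
            break  # no remaining edit improves the eval score
        best, best_score = top, top_score
    return best, best_score


if __name__ == "__main__":
    prompt, score = improve_prompt("You are a helpful agent.", CANDIDATE_EDITS, eval_score)
    print(prompt, round(score, 3))
```

The loop is only as good as `eval_score`: whatever the scorer rewards is what the prompt drifts toward, which is where the metric-gaming concern raised elsewhere in the thread comes in.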

@beffjezos SNNs are the future. But you probably already know that

LLMs are trained primarily on content, so they are best at predicting the next token. However, they have not learned how actual humans work together and make things happen, which causes them to struggle on multi-party collaboration tasks.
Maybe we should train on real people's communications instead of depending only on the digital footprint of how systems work?

@snowmaker Perhaps LLMs also do not understand their own limitations well enough? For example, dynamically loaded skills are needed to work around the limitations of attention.
@snowmaker Been thinking about this a lot. They are trained on building things the old way.

@TheBalkanHacker @beffjezos Oh, I think I was confusing the concept with speculative decoding. I haven’t looked into predictive coding too much, so maybe take what I said with a grain of salt!

@snowmaker LLMs are better at catching mistakes after making them instead of avoiding them in the first attempt. Quite human-like?

@cruthaifios @beffjezos Some ML scientist is reading this right now and wants to punch us in the face 🤣 ... "we'll be quiet now, sry"

@snowmaker hopefully true :)