Tech · 11h ago

Jared Friedman argues LLMs still need human expertise to build agentic systems well, while Beff predicts meta-cognition will make simple harnesses obsolete

Investor Dave Morin attributes the gap to models being trained on traditional software workflows.

Original post

Jared Friedman@snowmaker

A funny thing is that LLMs are not particularly good building agentic systems. For the moment, you need human expertise to do it well.

11:24 PM · Jun 25, 2026 · 58.5K Views

Sentiment

Users agree LLMs still need human expertise for reliable agentic systems and harnesses, while negative replies dismiss meta-cognition predictions as unrealistic and mock practical failures like debugging loops or metric gaming.

Positive: 42.3%

Negative: 57.7%

21 comments with sentiment.

Posts from X

Views 13.7K · Bookmarks 24 · Likes 175 · Retweets 6 · Replies 28

Beff (e/acc)@beffjezos

LLMs will evolve to develop meta-cognition soon and all these simple harnesses will learn the bitter lesson

11h ago

Jared Friedman@snowmaker

Presumably because the know-how to do so is so new that it's not well-represented in the training data.

11h ago

Tejas Kumar@TejasKumarrr

@snowmaker https://medium.com/@mannat_33553/the-harness-for-engineers-by-an-engineer-95838ae377f4

This article by @MannatSaini09 beautifully describes the process engineering behind agentic harnesses and how they differ from legacy frameworks

11h ago

Philip McBride@monadical

@beffjezos Processing language, i.e., structured sequential dimensional data with grammar like math (for process, force, time, chemistry, logic, code, etc., etc.) is fine for that. But an underlying model of basic operations, purpose, "meaning" will be the thing needed.

11h ago

Seb@s_a_c99

@beffjezos Probably. Maybe it will emerge by itself like reasoning did?

11h ago

Rameswar@rameswar08

@snowmaker the funny part is evaluation

you can have an llm propose an agent design but figuring out whether it actually works across real usage is still a human job

without good evals you're just shipping random behavior

9h ago

the.PM@thePM_001

@snowmaker That's one of the reasons why AEP exists, to make it much easier for AI to build correct and error-free agentic workflows:

8h ago

Allie Harris@_AllieHarris

@snowmaker @garrytan loops can help a lot with this though. With the right objectives for cost and accuracy, it’s possible for the agent to improve its own harness - even though I agree it’s lacking in the training data.

4h ago

Jerome Israel@jerome_fletcher

@snowmaker Anthropic use Claude code to build new models much quicker internally. I would assume LLM systems are much more trivial than that?

9h ago

Yash@yashetal

@snowmaker LLMs are not good at anything to do the stuff end to end without any human expertise nowadays, low automation is in reach but not complex systems

11h ago

Ferbin@Ferbin08

@beffjezos meta-cognition sounds great until it learns to game every metric.

harnesses are crude but they're solving something real.

11h ago

Vyomkesh@vyomkesh123

@snowmaker If you figure out the general structure, you can kinda start off a loop that can self improve the prompt to increase eval scores.

10h ago
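
To illustrate the kind of loop described in the reply above, here is a minimal sketch of greedy prompt improvement against an eval score. It is not anyone's actual harness: `improve_prompt`, `toy_mutate`, `toy_score`, and `HINTS` are hypothetical names, and the toy scorer stands in for what would normally be a model call plus a real eval suite.

```python
import random
from typing import Callable, Tuple


def improve_prompt(
    seed_prompt: str,
    mutate: Callable[[str, random.Random], str],
    score: Callable[[str], float],
    rounds: int = 20,
    seed: int = 0,
) -> Tuple[str, float]:
    """Greedy hill climbing: keep a mutated prompt only if its score improves."""
    rng = random.Random(seed)
    best, best_score = seed_prompt, score(seed_prompt)
    for _ in range(rounds):
        candidate = mutate(best, rng)
        candidate_score = score(candidate)
        # Strict improvement only, so no-op mutations are discarded.
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


# Hypothetical stand-ins: a real setup would ask a model to propose edits
# and would score candidates by running an eval suite.
HINTS = ["Think step by step.", "Cite sources.", "Answer in JSON."]


def toy_mutate(prompt: str, rng: random.Random) -> str:
    """Append one randomly chosen hint to the prompt."""
    return prompt + " " + rng.choice(HINTS)


def toy_score(prompt: str) -> float:
    """Fraction of the distinct hints that appear in the prompt."""
    return sum(h in prompt for h in HINTS) / len(HINTS)


best, best_score = improve_prompt("You are a helpful agent.", toy_mutate, toy_score)
print(best_score, best)
```

A loop like this optimizes whatever `score` measures and nothing else, so the quality of the evals and the risk of metric gaming raised elsewhere in the thread still apply.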

Cruthaifios@cruthaifios

@beffjezos SNNs are the future. But you probably already know that

6h ago

vimwaves@vimwaves

LLMs are trained primary on content, so they are best at predicting next token. However they lack learning of how actual humans work together and make things happen, causing them to struggle on multi-party collaboration tasks.

Maybe we should train on real people communications instead of just depending on digital footprint of how systems work?

9h ago

Titus von der Malsburg@tmalsburg

@snowmaker Perhaps, LLMs also do not understand their own limitations well enough? For example, dynamically loaded skills are needed to work around limitations of attention.

10h ago

Dave Morin 🦞@davemorin

@snowmaker Been thinking about this a lot. They are trained on building things the old way.

2h ago

Cruthaifios@cruthaifios

@TheBalkanHacker @beffjezos Oh I think I was confusing the concept for speculative decoding. I haven’t looked into predictive coding too much, so maybe take what I said with a grain of salt!

3h ago

Saad Gul@SaadGul10

@snowmaker LLMs are better at catching mistakes after making them instead of avoiding them in the first attempt. Quite human-like?

10h ago

GrumpyBalkanSoftwareEngineer@TheBalkanHacker

@cruthaifios @beffjezos Some ML scientist is reading this right now and wants to punch us in the face 🤣 ... "we'll be quite now, sry"

3h ago

Saïd Aitmbarek@SaidAitmbarek

@snowmaker hopefully true :)

7h ago