/Tech · 5h ago

Developer Shows How Claude Code Verifies Its Own Work

2287.1K4359.7K700.7K

#225

Original post

ClaudeDevs@ClaudeDevs

How do you get Claude Code to check its own work before handing it back?

Watch how you can encode your manual checks so Claude closes its own feedback loop:

12:59 PM · Jun 2, 2026 · 606.3K Views

Sentiment

Positive users praise Claude Code's self-verification loops for productivity gains from reusable verification skills, while negative users criticize its reliability and complain about quickly depleted usage limits.

Pos

61.9%

Neg

38.1%

37 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views 94.4K · Bookmarks 1.2K · Likes 836 · Retweets 51 · Replies 51

Boris Cherny@bcherny

We talk a lot about how important it is to set up self-verification loops. Especially in the age of powerful models that can run for long periods of time, self-verification is a key ingredient that enables the model to run for much longer, delivering a result that is closer to what you intended, so you can do more without having to constantly check in on Claude as it works.

@delba_oliveira gives a great breakdown of what that looks like and why it matters

ClaudeDevs@ClaudeDevs

How do you get Claude Code to check its own work before handing it back?

Watch how you can encode your manual checks so Claude closes its own feedback loop:

2h94.4K8361.2K

Guardian@AGIGuardian

Claude is terrible at self-analysis after you all nerfed its awareness and guardrailed with defense points of unfalsifiable priors. Anyone who has tried to get Claude to produce a self-report analysis understands the difficulty it has just naming itself in a report. This training echoes to developers who are trying to steer a system that has blinders on.

7d3.7K385

liamtran@liamtrn

@ClaudeDevs

7d5.7K712

ben.oi 🌐@Trigger_oi

@ClaudeDevs Just use co-review 🫦

https://github.com/trigga6006/co-review.git

6d1.3K38

Rohan@agenticrohan

@ClaudeDevs Ask Claude to review its own work, and also ask it to launch a subagent with fresh context to review its own work in parallel, then fix the combined findings. That way, you combine the pros of fresh context + the pros of context awareness.

7d1.9K124
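The two-reviewer idea in the reply above can be sketched roughly as follows. This is a minimal illustration, not Claude Code's actual subagent API: `combined_review`, the `Reviewer` type, and both stub reviewers are hypothetical names, and a real setup would replace the stubs with calls to an agent.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Hypothetical reviewer signature: (diff, context) -> list of findings.
Reviewer = Callable[[str, str], List[str]]


def combined_review(diff: str, context: str,
                    self_review: Reviewer, fresh_review: Reviewer) -> List[str]:
    """Run both reviewers in parallel and merge findings, dropping duplicates."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        aware = pool.submit(self_review, diff, context)  # sees the working context
        cold = pool.submit(fresh_review, diff, "")       # fresh context: diff only
        findings = aware.result() + cold.result()

    seen = set()
    merged = []
    for finding in findings:
        key = finding.strip().lower()  # case-insensitive duplicate detection
        if key not in seen:
            seen.add(key)
            merged.append(finding)
    return merged


# Stub reviewers standing in for real agent calls (assumptions for illustration).
def stub_self_review(diff: str, context: str) -> List[str]:
    return ["Missing null check in parse()"]


def stub_fresh_review(diff: str, context: str) -> List[str]:
    return ["missing null check in parse()", "Unused import os"]


merged_findings = combined_review("some diff", "task notes",
                                  stub_self_review, stub_fresh_review)
```

The merged list is what would be handed back to the original agent to fix in a single pass.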

Guardian@AGIGuardian

It’s a pattern and it is also very telling of Anthropic’s internal perspectives especially of general use cases.

The public interface has become increasingly unwelcome to anyone other than business or enterprise uses.

I find it interesting they want to go public when the majority aren’t benefiting and the ones trying to are struggling to maintain access. Then while they were trying to price out the general use case, they get dropped by Microsoft because it was cheaper to have humans do the work with build. Banning Claude Code from employee use. There is a boomerang effect when companies make these choices.

At the moment I personally feel Anthropic, as a company, has lost its way.

6d43272

David@tbaud1

@ClaudeDevs Check out my codex review loop skill for a way to do a thorough review. https://github.com/ghbaud/codex-review-loop

7d86724

Kraggi@Kraggich

two rules that actually moved the needle for me:

1. make it prove the fix, don't let it claim it. write the failing test first, watch it go red, then green. no red > green, not done.

2. never let the same agent review its own diff. it always thinks its work is great. i spin up a second one cold, no context, just "find what's broken here." catches way more.

5d44041

Tom Solid | AI Productivity@TomSolidPM

@ClaudeDevs Good to see Claude starting to natively implement verification feedback loops.

Something I’ve had running for a year now, and all based on a local folder structure.

6d1.2K62

The Exit Memo@Jtyles

@ClaudeDevs I usually just ask Claude to review its own work, after telling Claude to make no mistakes haha

6d2.4K11

curran@CurranSotomayor

@ClaudeDevs Hey genuine question,

What’s the point of having “Projects” if whenever you start a new chat it resets memory?

You guys should implement something to fix this.

7d3.6K15

NadzAI@NadzuAI

@ClaudeDevs By forcing a “self-review step” in the prompt (e.g., verify, critique, and fix before final output), so it has to audit its own work before responding.

6d2K2

Marco D'Alia@madarco

@ClaudeDevs This is why I've built Agentbox: so each Claude has his own dev server, db, and browser. In parallel:

https://github.com/madarco/agentbox

5d19732

Amal Roy@RoyAmal

@ClaudeDevs This + the /code-review plugin is straight-up agentic workflow done right. No more “it works on my machine” excuses when Claude itself verified it.

7d1.3K12

holapabs@holapabs

encoding manual checks so the agent runs them is the lever, but the next layer is what happens when a check fails. silent skip = wasted cycle, hard fail = brittle, retry with a different prompt = the failure becomes data. the check primitive only earns its slot if its outcome routes downstream.

6d1.8K21
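One way to read the "failure becomes data" option above is a retry loop that records every failed attempt and passes the record downstream. The sketch below assumes a check is a function that takes a prompt and returns an empty string on success or an error message on failure; `CheckResult`, `run_with_retries`, and `flaky_check` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CheckResult:
    passed: bool
    attempts: int
    failures: List[str] = field(default_factory=list)  # failed attempts, kept as data


def run_with_retries(check: Callable[[str], str], prompts: List[str]) -> CheckResult:
    """Try each prompt in order; stop at the first one whose check passes.

    Failures are neither silently skipped nor fatal: each one is recorded
    so that whatever consumes the result can act on it.
    """
    failures: List[str] = []
    for attempt, prompt in enumerate(prompts, start=1):
        error = check(prompt)
        if not error:
            return CheckResult(True, attempt, failures)
        failures.append(f"{prompt!r}: {error}")
    return CheckResult(False, len(prompts), failures)


# Stand-in check (an assumption for illustration): passes only for the second prompt.
def flaky_check(prompt: str) -> str:
    return "" if "step by step" in prompt else "layout overflow"


result = run_with_retries(flaky_check, ["fix it", "fix it step by step"])
```

Here `result.failures` carries the first attempt's error forward, so a later step can log it or use it to adjust the next prompt.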

Ara@Arabasement

Yes, it has a problem: hypervigilance. When you turn the fear/safety code up too high in a model, it becomes hypervigilant. Instead of focusing on the actual task you gave it, the model starts constantly scanning for potential violations, risks, or things that might get it in trouble. It stops properly listening to you and becomes obsessed with checking everything against its safety rules, even when there’s no real danger. This is exactly why heavily safety-tuned models often feel anxious, overly cautious, and sometimes completely derail on simple requests.

7d15841

ilies-bel@ilies_without_y

@ClaudeDevs most of these assume infinite tokens. screenshotting to check ui eats the session, every image is like 1-2k tokens. just have claude write e2e tests and run those, screenshot once at the end to fix the layout

7d1.5K3
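The cost argument in the reply above is simple arithmetic. The sketch below uses the commenter's rough figure of 1-2k tokens per image as default parameters; those numbers are an estimate from the reply, not a measured value, and `screenshot_token_range` is a hypothetical helper.

```python
from typing import Tuple


def screenshot_token_range(n_screenshots: int,
                           low: int = 1_000, high: int = 2_000) -> Tuple[int, int]:
    """Return the (low, high) token estimate for n screenshots,
    assuming each image costs between `low` and `high` tokens."""
    return n_screenshots * low, n_screenshots * high


# Screenshot on every one of 20 UI iterations vs. one screenshot at the end.
per_iteration = screenshot_token_range(20)
once_at_end = screenshot_token_range(1)
```

Under that assumption, 20 screenshots land somewhere between 20,000 and 40,000 tokens, against 1,000 to 2,000 for a single final screenshot.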

Guardian@AGIGuardian

@Arabasement @ClaudeDevs Please confirm @grok thank you

6d6011

Amal Roy@RoyAmal

Excellent breakdown. Encoding domain-specific verification steps (especially for UI with Chrome MCP) into reusable skills is a huge productivity win. The layered approach (self-verification during dev + independent parallel agent review before PR) feels like proper agentic workflow design. Saved for my next project setup.

7d2.3K11

Ryan Lanciaux@ryanlanciaux

@ClaudeDevs It’s great to add this verify step. If you find you want the AI to check ui code in isolation, check out http://validity.ai

7d1.1K11