/AI · 10h ago

LangChain Labs and Harvey use DeepSeek v4 Flash to cut agent verification costs by 1,000x

Batch scoring achieved 96% agreement with frontier models

Original post

Harrison Chase#739

LangChain@LangChain

http://x.com/i/article/2061784083839983616

10:36 AM · Jun 2, 2026 · 31.5K Views

Sentiment

Users are excited about the efficient verifiers LangChain Labs and Harvey built for legal AI agents, which cut evaluation costs by up to 1,000x and support scaling evals and training.

Pos

100.0%

Neg

0.0%

4 comments with sentiment.

Posts from X

Most Activity

Views 13.2K · Bookmarks 134 · Likes 123 · Retweets 16

Harvey@harvey

Can we design legal agent verifiers that are up to 1,000x cheaper?

Verifiers are LLM judges that check an agent’s work against rubric criteria: they're used both in agent benchmarking and as reward signal in post-training.

But verifiers can be a bottleneck at scale.

For example, our Legal Agent Benchmark (LAB), comprising 1,200+ legal tasks across 24 different practice areas, requires grading an average of 50+ rubric criteria per answer.

We partnered with @LangChain Labs to design more efficient verifiers for LAB, comparing batch vs per-criterion scoring and open/cost-efficient models against Opus 4.7.

The results were surprising:

DeepSeek v4 Flash preserved much of the Opus 4.7 verifier signal, with 94-96% agreement across batch mode and per-criterion mode.

This came with a massive reduction in cost: 18x cheaper on per-criterion verification, and ~1,000x cheaper on batch verification.
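To make the two scoring modes concrete, here is a minimal Python sketch. It rests on assumptions the post does not confirm: that "agreement" is the plain fraction of rubric criteria on which the cheaper verifier matches the reference verifier, that per-criterion mode makes one judge call per criterion, and that batch mode makes one call per answer. The function names `agreement_rate` and `judge_calls` are hypothetical, and the round numbers 1,200 and 50 stand in for the post's "1,200+" tasks and "50+" criteria.

```python
from typing import Sequence


def agreement_rate(reference: Sequence[bool], candidate: Sequence[bool]) -> float:
    """Fraction of criteria where both verifiers return the same pass/fail verdict.

    Assumption: a simple match rate; the post does not say how agreement was measured.
    """
    if len(reference) != len(candidate):
        raise ValueError("verdict lists must have the same length")
    if not reference:
        raise ValueError("verdict lists must not be empty")
    matches = sum(r == c for r, c in zip(reference, candidate))
    return matches / len(reference)


def judge_calls(num_answers: int, criteria_per_answer: int, batch: bool) -> int:
    """Judge calls needed to grade every answer.

    Assumption: per-criterion mode issues one call per criterion,
    batch mode grades all criteria for an answer in a single call.
    """
    return num_answers if batch else num_answers * criteria_per_answer


# Illustrative verdicts for one answer with 50 criteria (made-up data).
reference_verdicts = [True] * 50
candidate_verdicts = [True] * 48 + [False] * 2  # disagrees on 2 criteria

print(f"agreement: {agreement_rate(reference_verdicts, candidate_verdicts):.0%}")
print("per-criterion calls:", judge_calls(1_200, 50, batch=False))
print("batch calls:", judge_calls(1_200, 50, batch=True))
```

Under these assumptions, batch mode cuts the number of judge calls by the number of criteria per answer. Fewer calls alone would not explain the full cost gap reported in the post, which also reflects the switch to a cheaper model.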

In an RL setting with 3,200 rollouts, the cost of verification drops from $18,000 to $18.
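The arithmetic behind that RL figure can be checked in a few lines of Python. The three input numbers come from the post; the per-rollout breakdown is derived here and is not stated in the source, and the helper name `per_rollout` is hypothetical.

```python
# Figures quoted in the post.
ROLLOUTS = 3_200
COST_BEFORE_USD = 18_000  # verification cost before the change
COST_AFTER_USD = 18       # verification cost after the change


def per_rollout(total_usd: float, rollouts: int) -> float:
    """Average verification cost per rollout (derived, not stated in the post)."""
    return total_usd / rollouts


reduction = COST_BEFORE_USD / COST_AFTER_USD

print(f"reduction: {reduction:,.0f}x")
print(f"per rollout before: ${per_rollout(COST_BEFORE_USD, ROLLOUTS):.3f}")
print(f"per rollout after:  ${per_rollout(COST_AFTER_USD, ROLLOUTS):.6f}")
```

The ratio works out to exactly 1,000x, consistent with the "~1,000x cheaper on batch verification" claim.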

9h · 13.2K views · 123 likes · 134 bookmarks

Replies: 7

Harrison Chase@hwchase17

Verifiers are important for scaling evals/RL

But costs add up! So can we make them cheaper?

Some great work by @Vtrivedy10 @jakebroekhuizen in conjunction with @nikogrupen @gabepereyra and the Harvey team on this

LangChain@LangChain

http://x.com/i/article/2061784083839983616

9h · 6.3K views · 50 likes · 49 bookmarks