AI · 4h ago

Instructed-Retriever-1 Matches Claude Sonnet 4.5 Retrieval Quality With Lower Latency



Original post

Andrew Drozdov#954

Databricks AI Research @DbrxMosaicAI

Most agentic search systems get better by thinking longer: more tool calls, more reason-act loops, each step waiting on the last. Quality goes up, but so does latency.

Instructed-Retriever-1 takes a different route. Instead of scaling test-time compute sequentially, it scales it in parallel. One retrieval-specialized model fans the work out: it generates multiple query and filter formulations to widen recall, then reranks the merged evidence with a multi-pivot reranker to sharpen precision. Both stages run at once, so searching more broadly no longer means searching more slowly.
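To make the fan-out idea concrete, here is a minimal sketch of the general pattern the post describes: several query formulations issued concurrently, their hits merged, then a rerank pass. It is not Databricks' implementation. The corpus, `formulate_queries`, `search`, and `rerank` are all hypothetical stand-ins, and the reranker is a plain term-overlap score rather than the multi-pivot reranker the post mentions.

```python
import asyncio

# Hypothetical toy corpus; a real system would query a search index instead.
CORPUS = {
    "doc1": "reset your password from the account settings page",
    "doc2": "billing invoices are emailed monthly",
    "doc3": "password rules require twelve characters",
    "doc4": "contact support to close an account",
}


def formulate_queries(question: str) -> list[str]:
    """Stand-in for model-generated queries: the question plus each keyword."""
    lowered = question.lower()
    return [lowered] + lowered.split()


async def search(query: str) -> list[str]:
    """Stand-in retriever: ids of documents sharing any term with the query."""
    await asyncio.sleep(0)  # placeholder for index or network latency
    terms = set(query.split())
    return [doc_id for doc_id, text in CORPUS.items() if terms & set(text.split())]


def rerank(question: str, doc_ids: list[str]) -> list[str]:
    """Stand-in reranker: sort by term overlap with the original question."""
    terms = set(question.lower().split())

    def score(doc_id: str) -> int:
        return len(terms & set(CORPUS[doc_id].split()))

    # Ties are broken by document id so the ordering is deterministic.
    return sorted(doc_ids, key=lambda doc_id: (-score(doc_id), doc_id))


async def retrieve(question: str, top_k: int = 2) -> list[str]:
    queries = formulate_queries(question)
    # Fan out: every formulation is searched concurrently, not one after another.
    results = await asyncio.gather(*(search(q) for q in queries))
    # Merge and deduplicate the hits before reranking.
    merged = sorted({doc_id for hits in results for doc_id in hits})
    return rerank(question, merged)[:top_k]


if __name__ == "__main__":
    print(asyncio.run(retrieve("reset password account")))
```

In this toy setup, adding more formulations widens the candidate pool without adding sequential steps, which is the trade the post is pointing at.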

The result inside Knowledge Assistant: search time drops more than 3x and answer time 2x, with time to first token around two seconds, and no drop in quality (it matches Claude Sonnet 4.5 retrieval quality on KARLBench). For the people using it, that means far less waiting between question and answer, the freedom to ask more follow-ups, and more of the knowledge base actually surfaced. Rolling out to all customers now, with no reconfiguration.

Read how we did it: https://www.databricks.com/blog/3x-faster-search-parallel-test-time-scaling-instructed-retriever-1

9:46 AM · Jun 4, 2026 · 1.3K Views


Sentiment

Positive users highlight benefits of the multi-pivot groupwise reranker in Databricks Instructed-Retriever-1 for adding useful context, while negative users see the parallel scaling approach as unoriginal and overhead-prone.

Positive: 50.0% · Negative: 50.0%

2 comments with sentiment.


Posts from X

Most Activity

Views: 710 · Bookmarks: 2 · Likes: 23 · Retweets: 9

Andrew Drozdov @mrdrozdov

New Product Update: We trained a retrieval-specialized model for Knowledge Assistant. It matches Claude Sonnet 4.5 retrieval quality at substantially lower latency.

Introducing Instructed-Retriever-1.

3h ago

Replies: 2

Andrew Drozdov @mrdrozdov

I’m particularly excited about our multi-pivot groupwise reranker. Thoughtfully adding more context to each reranking call can really pay off.

Andrew Drozdov @mrdrozdov

The search harness for Instructed-Retriever-1 makes heavy use of Parallel Test-Time Scaling. By spending more compute in parallel, we expose several knobs for improving quality while keeping latency low.
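A rough way to see why parallel spending keeps latency low is a simplified model: dependent steps add their latencies, while independent calls issued together finish when the slowest one does. The numbers below are made up for illustration, not measurements from Instructed-Retriever-1, and the model ignores scheduling and merge overhead.

```python
def sequential_latency_ms(step_latencies_ms: list[int]) -> int:
    """Each step waits on the previous one, so the latencies add up."""
    return sum(step_latencies_ms)


def parallel_latency_ms(step_latencies_ms: list[int]) -> int:
    """Independent calls issued together finish when the slowest one does."""
    return max(step_latencies_ms)


# Hypothetical per-call latencies in milliseconds.
calls_ms = [400, 500, 300]

print(sequential_latency_ms(calls_ms))  # 1200
print(parallel_latency_ms(calls_ms))    # 500
```

Under this model, adding another parallel call raises total compute but only raises wall-clock time if the new call is the slowest one.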
