DARP Retrieval Method Boosts Robot Imitation Learning Up To 200%

Original post

Here’s a pretty weird and surprising result - retrieval-augmented generation works unreasonably well for robot learning – but only when parameterized using difference vectors!

We introduce Difference-Aware Retrieval Policies for Imitation Learning (DARP), a simple, semi-parametric RAG architecture for imitation learning that achieves gains of up to 200% over standard behavior cloning. No additional assumptions beyond BC, just a little architecture switch! The theory backing it up is pretty cool too and it works on real robots! :)

Play with our website to understand better: https://weirdlabuw.github.io/darp-site/

🧵(1/7)

10:29 AM · Jun 9, 2026 · 5.7K Views

Sentiment

Users are excited about the DARP Retrieval Method boosting robot imitation learning up to 200% because of its impressive results and strong undergraduate leadership on the project.

Positive: 100.0%. Negative: 0.0%. (2 comments with sentiment.)

Posts from X

Abhishek Gupta@abhishekunique7

How do we achieve these surprising results? Instead of learning a global, direct state-to-action mapping, DARP reparameterizes the imitation learning problem through retrieval.

We retrieve the k-nearest expert neighbors and predict actions conditioned on the relative difference vectors between those neighbor states and the query state, aggregating them for a final action prediction. (2/7)
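
A minimal sketch of the retrieve, difference, and aggregate steps described above, written in NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function names (`retrieve_knn`, `darp_style_predict`, `linear_head`), the sign convention `query - neighbor`, the choice to also feed each neighbor's expert action to the prediction head, and the use of a plain mean for aggregation are all hypothetical. In practice the head would be a learned network; here it is a fixed linear map so the result can be checked by hand.

```python
import numpy as np

def retrieve_knn(query, expert_states, k):
    """Return indices of the k expert states closest to `query` (Euclidean)."""
    dists = np.linalg.norm(expert_states - query, axis=1)
    return np.argsort(dists)[:k]

def darp_style_predict(query, expert_states, expert_actions, head, k=3):
    """Predict an action from difference vectors to the k nearest expert states.

    `head(diff, neighbor_action)` stands in for a learned per-neighbor
    predictor (an assumption of this sketch). Per-neighbor outputs are
    averaged to form the final action.
    """
    idx = retrieve_knn(query, expert_states, k)
    diffs = query - expert_states[idx]          # shape (k, state_dim)
    per_neighbor = np.stack(
        [head(d, a) for d, a in zip(diffs, expert_actions[idx])]
    )
    return per_neighbor.mean(axis=0)

# Toy data: the "expert" acts linearly, action = M @ state.
rng = np.random.default_rng(0)
M = np.array([[1.0, -2.0], [0.5, 3.0]])
expert_states = rng.normal(size=(50, 2))
expert_actions = expert_states @ M.T

def linear_head(diff, neighbor_action):
    """Hypothetical head: correct the neighbor's action by a linear term in diff."""
    return neighbor_action + M @ diff

query = np.array([0.3, -0.7])
pred = darp_style_predict(query, expert_states, expert_actions, linear_head, k=3)
```

With this toy linear expert, every neighbor's prediction equals `M @ query`, so the average does too. A learned head would only approximate this, which is where averaging over several neighbors can help.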

Abhishek Gupta@abhishekunique7

But why does DARP provide such large gains just through reparameterization? Theoretically, we show that DARP operates as an implicit manifold regularizer. By embedding neighborhood aggregation directly into the policy architecture, it achieves parameter-free Laplacian smoothing. This causes smooth, low-variance behavior, leading to more stable models and improved performance. (3/7)
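
For readers unfamiliar with the term, the block below gives the textbook form of graph-Laplacian smoothing. It is background only and not the paper's derivation: the neighbor graph, the weights $w_{ij}$, and the symbols $W$, $D$, $L$, $f$ are assumptions introduced here for illustration.

```latex
% Neighbor graph over expert states s_1, ..., s_n with symmetric weights w_{ij} >= 0.
% W = (w_{ij}), D = diag(d_1, ..., d_n) with d_i = \sum_j w_{ij}, and L = D - W.
% For a function f with values f_i = f(s_i), the Laplacian smoothness penalty is
\[
  f^{\top} L f \;=\; \frac{1}{2} \sum_{i,j} w_{ij} \, \lVert f_i - f_j \rVert^{2} .
\]
% One step of (random-walk) Laplacian smoothing replaces each value by a
% weighted average over its neighbors:
\[
  f \;\leftarrow\; \bigl(I - D^{-1} L\bigr) f \;=\; D^{-1} W f,
  \qquad\text{i.e.}\qquad
  f_i \;\leftarrow\; \frac{1}{d_i} \sum_{j} w_{ij} \, f_j .
\]
```

With uniform weights over the k nearest neighbors, the update is a plain average over retrieved neighbors and has no learnable parameters, which is one way to read "parameter-free" here.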

Abhishek Gupta@abhishekunique7

Empirically, this difference-based reparameterization directly mitigates covariate shift.

By shifting the frame of reference to local distance vectors, query states that are globally out-of-distribution (OOD) remain locally in-distribution. This local consistency grants the policy remarkable robustness during closed-loop rollouts. (4/7)
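
A toy numerical illustration of the "globally OOD, locally in-distribution" idea, under assumptions that are not from the thread: a 2-D grid of training states, Euclidean k-nearest neighbors with k = 4, and a hypothetical helper `knn_diffs`. It only compares magnitudes of difference vectors and says nothing about a learned policy's behavior.

```python
import numpy as np

def knn_diffs(query, states, k, exclude_self=False):
    """Difference vectors from the k nearest states to `query`."""
    dists = np.linalg.norm(states - query, axis=1)
    order = np.argsort(dists)
    if exclude_self:
        order = order[1:]  # drop the zero-distance match (the state itself)
    return query - states[order[:k]]

# Training states: an 11 x 11 grid covering the unit square [0, 1]^2.
grid = np.linspace(0.0, 1.0, 11)
train_states = np.array([[x, y] for x in grid for y in grid])
k = 4

# Largest difference-vector norm seen when each training state
# retrieves its own neighbors.
train_max = max(
    np.linalg.norm(knn_diffs(s, train_states, k, exclude_self=True), axis=1).max()
    for s in train_states
)

# A query slightly outside the square: out of range in raw coordinates.
query = np.array([1.05, 0.5])
outside_box = bool((query > train_states.max(axis=0)).any())

# Its difference vectors are still no larger than those seen in training.
query_max = np.linalg.norm(knn_diffs(query, train_states, k), axis=1).max()
```

Here the raw query lies outside the range of every training state, yet the difference vectors a retrieval-based policy would consume stay within the range of magnitudes observed on the training set.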

Abhishek Gupta@abhishekunique7

Crucially, DARP remains entirely within the standard behavior cloning regime: no simulators, interactive experts, or online training are required.

DARP sees substantial improvement across architectures and input modalities, and even significantly improves upon diffusion policies in real-world robotic manipulation tasks. (5/7)

Abhishek Gupta@abhishekunique7

Empirically, DARP is dead simple to implement, just a little nearest neighbors + the simple aggregation architecture - no objective change, no data change. Since DARP is easy to scale, you can drop it in for your favorite policy class! Nicely integrates with rich distribution classes like diffusion and works directly from visual inputs! You basically get pretty huge gains across the board with very few changes required. (6/7)

Abhishek Gupta@abhishekunique7

This is a particularly exciting project because it is led by our amazing undergraduate @quinncomputer . Quinn was a tour-de-force on this project, he pulled it together with very little help from us! We basically stumbled on this pretty surprising result and then spent a bunch of time trying to figure out why it worked. That resulted in some pretty cool theory, worked out by the excellent @siddhss5. Take a look at the website and the paper for more details - and use it in your work and tell us how it does!

This was work published at #ICLR2026, with @khimya, Ethan Pronovost, Paarth Shah, @siddhss5.

Website: https://weirdlabuw.github.io/darp-site/
Paper: https://arxiv.org/abs/2606.09758
Colab: https://colab.research.google.com/drive/1N0kBjaT773HkzESaXw884wmsmEpZJjEy

@quinncomputer is applying for PhD programs this year, don't miss the chance to recruit him! :)

Max For AI@MaxForAI

@abhishekunique7 Truly astonishing results
