1d ago

Cohere's Kris Cao argues that the classic imitation learning algorithm DAgger has been rebranded as "on-policy distillation."

Both approaches train a learner on data generated by its own policy and labeled with feedback from an expert (or teacher), collected iteratively.


Original post

When did DAgger get renamed on-policy distillation?

@kroscoo LOL, I didn't make the connection, but this seems to be a correct one. 😅

Kris Cao (@kroscoo)

When did DAgger get renamed on-policy distillation?

11:57 AM · May 28, 2026 · 1.7K Views

11:43 PM · May 28, 2026 · 531 Views
