Alex Dimakis explains the mechanics of on-policy LLM distillation following a community query by Rishabh Agarwal · Digg