Taco Cohen's mathematical proof equating KL-regularized reinforcement learning to variational inference's ELBO resurfaces in AI discussions · Digg