Self-play regularized with 30 minutes of human data produces human-like autonomous driving policies · Digg