Google DeepMind's Pranav Shyam argues short-horizon PPO reduces to bandit problems, but open-source developer Grad disputes the latency impact · Digg