Reinforcement learning pioneer Richard Sutton argues that supervised LLMs cannot achieve original discovery because they only mimic training inputs · Digg