Jürgen Schmidhuber from Technische Universität München outlined an algorithm using self-supervised recurrent neural networks for reinforcement learning and planning in non-stationary environments in his 1990 paper · Digg