Jürgen Schmidhuber from Technische Universität München outlined an algorithm using self-supervised recurrent neural networks for reinforcement learning and planning in non-stationary environments in his 1990 paper
——0——
Kangwook Lee shared a highlighted page screenshot on social media.
@Kangwook_Lee 😂
And guess what ;-) "The algorithm may concurrently learn the model and learn to pursue the main goal."
11:57 PM · May 18, 2026 · 329 Views
11:58 PM · May 18, 2026 · 77 Views
And guess what ;-)
"The algorithm may concurrently learn the model and learn to pursue the main goal."

World-action models seem to be emerging across different domains, which is super cool! Action model = learn to control World model = learn what control does World-action model = learn both together Robotics: https://arxiv.org/pdf/2602.15922 Terminal agents: ↓
11:53 PM · May 18, 2026 · 4.6K Views
11:57 PM · May 18, 2026 · 329 Views