
A Medium blog post on Shannon and Kolmogorov limits argues reinforcement learning from interventions outperforms supervised learning on static data when paired with world modeling for counterfactuals

Transformers hit scaling ceilings that parameter growth alone cannot fix.

Original post

Very well written blog. I think of RL as learning from interventions, and it kinda explains why it's more powerful as a paradigm than supervised learning. Now learning from counterfactuals is something we haven't been historically good at but maybe world modelling + RL can get us there.

2:54 PM · May 22, 2026
QUOTE POST · swyx (#214) · @SWYX

co-sign. a very handy mental framework for what kinds of learning transformers do well today, and why it runs into limitations. when @ankit2119 and i wrote about the need for adversarial world models earlier this year, we were describing a couple of the functions of these rungs of thinking that bring us ever closer to the kolmogorov-limit generator of reality. throwing more params, more power, more everything at a demonstrably inefficient paradigm will be outclassed by the simple solution that can hypothesize and seek truth rather than backfit a house of cards - although the bitter lesson is it is simpler to scale and we may hit agi anyway because human intelligence just isn’t that smart nor plentiful

Rishabh Agarwal @agarwl_

Very well written blog. I think of RL as learning from interventions, and it kinda explains why it's more powerful as a paradigm than supervised learning. Now learning from counterfactuals is something we haven't been historically good at but maybe world modelling + RL can get us there.

9:54 PM · May 22, 2026 · 28.2K Views
6:33 AM · May 23, 2026 · 1.6K Views