RL is a mistake, thinking is a mistake, and if we just put all the money into crafting an astronomically good, massive dataset, we'd pretrain a model that outperforms everything that exists by a considerable margin
source: my ass (I have no idea what I'm talking about)
People loved Fable 5 because it inferred intent better than any other model by a considerable margin. Sure, it was also the most capable model, but it's ability to just 'get' what you wanted, was unparalleled.














