LLMs Surpass Humans At Next-Word Prediction But Fail To Unify Physics
LLMs don't seem especially good at learning the deep, generalizable skills. It takes them 500 million watts and most of the human text ever written and a ton of extra training on hard problems to get to the point where human mathematicians go "huh, idk anymore!"
And you can perhaps see how taking an AI, having it produce lots of text about how to solve hard problems, and then tuning it in whatever directions happen to work could tune the AI to get better at learning deep skills & how to compose them.
One thing I know is that people who have confidently predicted that LLMs would "hit a wall" have been wrong again and again and again over the last few years (e.g., here's LeCun being wrong by about 4,996 GPT generations: https://www.youtube.com/watch?t=3474&v=SGzMElJ11Cc&feature=youtu.be).
But are they good *enough*? Are they learning the generalizable skills (and how to chain them) well enough that they'll soon be able to outstrip us all? I don't know!
Even if LLMs can't go all the way: architectures change. If LLMs can't do the job & if the race continues, people will find a new architecture that *can* learn the deeper patterns. We know it's possible, because brains are an existence proof.
You can't look only at what's been achieved so far. You could've stared at mammal brains for eons and concluded they can't support "true engineering." ("Closest they'll ever get is beaver dams.") Then you'd be blindsided by a line of primates that learned just a little deeper.
"They're just trained on human data" is not the reason LLMs are still nondangerous. "They're just predictors" is both (a) false and (b) not the reason LLMs are still nondangerous.
That LLMs are still passively safe is a *fragile* fact about their architecture and compute limitations, not a fundamental fact about what happens when you train on human data.