Tal Linzen, associate professor at NYU and research scientist at Google, discusses on the Information Bottleneck podcast why children acquire language from roughly 100 million words while large language models require trillions of tokens.
As models get better at predicting the next word, they become worse models of human language processing.
This was a super fun conversation, thanks for having me on the podcast!
New episode of The Information Bottleneck is out with @tallinzen, Associate Professor at NYU and Research Scientist at Google. Tal works at the intersection of cognitive science and language models, and he's one of the clearest voices on what humans and LLMs can actually teach us about each other. We talked about why children learn language from 100M words while LLMs need trillions, the surprising finding that as models get better at predicting the next word they become worse models of humans, inductive biases and synthetic languages, world models and whether transformers actually use them, BabyLM, and how AI coding tools are changing the way he teaches at NYU. I'm sure you will enjoy it!
The episode - https://www.the-information-bottleneck.com/language-cognition-and-the-limits-of-llms/