Livetweet @mcxfrank talk: If we care about intelligence and cognition, AI has recently allowed a change: we now have two ways to study them. AI of course lets us do a lot that we wouldn't dare try on our children (brain surgery, never telling a child about cats...) #acl #conll
Users are excited about AI enabling new methods to study intelligence and cognition because it supports amazing participatory data collection efforts like Wordbank.
This also raises the differences between the two (hypotheses in pic). Why do models need so much more data for a similar result? Why can humans learn much more efficiently (better is debatable)? (Known as the @babyLMchallenge c.f. https://babylm.github.io/ if you're interested)
@mcxfrank (amazing) works also create huge participatory data collection efforts. Don't miss them. Wordbank gives you the words children around the world know (at 16-30mo). https://wordbank.stanford.edu/
Maybe child data is simply more useful? They checked on tinydialogues, childes and babyLM: the data children hear might (?) be useful for children, but it is worse for LLMs. Frank mentions it is less diverse; I'll add that it is also less challenging, and motivation is a human issue.
So languages do not interfere. Well sure, but wouldn't you expect an improvement in knowledge, skills and anything nonlinguistic? The knowledge you learn helps across the languages. This question has been bothering me lately, as you probably noted in my papers, social etc.
I wonder if some children accelerate faster. Are there connections between starting time and acceleration? Do non-noise outliers exist? We think of teaching more, but can we teach "learning", even at the cost of a slower start? (p.s. this is on words, but we've got to start somwe..
Sadly I missed part of the bilingual section (I am that dedicated to sharing his talk here), but iiuc they show that also learning another language doesn't reduce the ability to learn language.
Maybe children get their language in the right order? Well maybe, but for models it's worse than random (as Michael said, there is by now accumulating evidence that a curriculum hardly changes anything; see his work, the babyLM findings and the award-winning negative results).
The results from that easily give you worldwide comparisons: yes, girls learn more words early on, children's learning accelerates, etc.
A similar issue arises in multimodal training: adding a lot of images doesn't improve your "intelligence" (pic). You need very sophisticated training (which they create) to force the images to improve anything for the text.
Training on a large amount of data (not only children) sounds promising, but apparently, the mismatch between what you see and what is said is too great.
Databrary is the effort of recording the views and sounds of babies. Collected by many contributors so far, it will soon reach the amount of data of a whole (wake-time) year! https://databrary.org/

