Tech · 15h ago

DeepMind Researcher Predicts Networks Of Neural Networks Over Pure Scaling

Retweets 10 · Likes 141 · Bookmarks 97 · Views 67.9K
Original post

Fwiw - I understand that this is the consensus view, but I think history will look back with surprise that it didn't bear out in the end.

In the 1960s, an employee at IBM or Bell Labs would have said the same thing about the Mainframe computer... and they were incredible (and many are still in use today).

But it wasn't just "bigger mainframes forever" any more than the library was just "bigger Library of Alexandria forever".

I say that as someone who joined DeepMind nearly 10 years ago doing language modeling research. I have had access to large scale and small scale compute during that time.

I personally think there's an enormous amount of low-hanging fruit which doesn't require.

The future is networks of neural networks:
- better routers
- better benchmarks
- better access to non-public / niche information
- better pricing mechanisms
- better source attribution
- better unlearning
- ...

There's so much great research to be done. And much of it remains low-hanging because there are some subtle reasons why highly resourced orgs don't tend to pursue it.

Aidan Clark (@_aidan_clark_)

If you want to work on pretraining-for-AGI, join OpenAI, Google, Meta or the Anthropic/XAI/Cursor supergroup.

The bitter truth of the widening compute gap is that all the problems which are actually on the critical path to AGI now demand that level of compute.

9:10 PM · Jun 9, 2026 · 67.9K Views
Sentiment

Positive users are optimistic that networks of specialized neural models will deliver higher accuracy and easier safety, while negative users dismiss pretrain labs as outdated and demand personal AGI on cellphones.

Pos 33.3% · Neg 66.7%
6 comments with sentiment.

If @peterthiel asked me "what does everyone believe that I know to be wrong"... this would be my answer.

(And Aidan is very smart and he's describing a view for which there is plenty of evidence, and near universal agreement. I just disagree.)

1d · Views 1.1K · Likes 7

If none of this makes sense and you just need to read "a completely different way to think about AI progress" and a bunch of launch points for research in AI, a few links:
- https://attribution-based-control.ai/
- https://github.com/iamtrask/abcGPT
- https://openmined.org/blog/what-is-broad-listening/
- https://openmined.org/blog/secure-enclaves-for-ai-evaluation/

1d · Views 1.1K · Likes 8 · Bookmarks 9
broadfield-dev (@broadfield_dev)

@iamtrask wouldn't. Those companies are at least 12 months behind SOTA. Code is finite, predictable, and comes with error messages. They will never build anything but great copy/paste databases.

They are just big.

1d · Views 124

@peterthiel To be more specific. A global network of highly interconnected, neural-network-routed, small, specialized models will ultimately deliver:
- higher accuracy
- faster speed
- lower cost

than large, monolithic systems.

1d · Views 661 · Likes 6 · Bookmarks 2
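
For readers asking what "networks of neural networks" could look like in practice, here is a minimal sketch of the routing idea in the post above: a router scores an incoming query against several small specialists and forwards it to the best match. Everything in it is an assumption made for illustration and does not come from the thread. The specialist names, their keyword descriptions, and the `route` function are hypothetical, and the bag-of-words cosine score is a simple stand-in for a learned neural router.

```python
import math
from collections import Counter

# Hypothetical specialists: plain functions standing in for small models.
def math_specialist(query: str) -> str:
    return "math:" + query

def code_specialist(query: str) -> str:
    return "code:" + query

def law_specialist(query: str) -> str:
    return "law:" + query

# Each specialist is described by a few keywords (illustrative only).
SPECIALISTS = {
    "math": ("algebra equation integral proof number", math_specialist),
    "code": ("python function bug compile error", code_specialist),
    "law": ("contract liability statute court", law_specialist),
}

def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count whitespace-separated tokens."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def route(query: str):
    """Score the query against every specialist and dispatch to the top one."""
    q = bag_of_words(query)
    scores = {
        name: cosine(q, bag_of_words(description))
        for name, (description, _) in SPECIALISTS.items()
    }
    best = max(scores, key=scores.get)
    return best, SPECIALISTS[best][1](query)

name, answer = route("fix this python bug")
print(name, answer)
```

In a real system the scoring function would presumably be a trained model and the specialists would be separate networks, possibly run by different parties. The sketch only shows the dispatch pattern, not the accuracy, speed, or cost claims made in the post.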

One final thing - I think the biggest barrier to breakthrough research is allowing yourself to subscribe to industry groupthink (or to the polar opposite of that groupthink).

Go in a 3rd direction. Follow the scaling laws. Look for bridges across fields (especially deep learning, cryptography, and distributed systems).

It's never been a better time to do research.

1d · Views 860 · Likes 8 · Bookmarks 1

@peterthiel They'll also be easier to make safe in many ways, but now I'm going on a tangent. This is enough for now.

1d · Views 332 · Likes 6
Joel Kreager (@JoelKreager)

@iamtrask Can AI really break reality any better than QAnon did? We can't agree on the most basic facts already.

15h · Views 48 · Likes 1
dragonAI (@AIMLforEdu)

@iamtrask What does it mean by networks of neural networks? 🤔

19h · Views 39 · Likes 1

@iamtrask so he thinks the pretrain-on-mainframes crowd is just slow to see what's obvious

name one frontier lab where the bitter truth is not hiring

1d · Views 97

@iamtrask I'm doing work on my macbook pro, I'm on CPU now. But I do use the models to write code and if model developers nerf them that will make work more difficult I guess. The big issue as ever is money. There are massive IPOs coming and owners want lots of money.

21h · Views 92
broadfield-dev (@broadfield_dev)

@iamtrask Personal AGI on your cellphone or gtfo.

1d · Views 7
Rugbist (@rugbist_)

@iamtrask mainframes were right for their era. the question is if the current bet is that wrong or just mispriced

1d
Invincible (@InvincibleEdge)

@iamtrask history judges slower than the earnings calls do

think those bell labs guys would recognize the same vibe in the air now

1d
Blissy (@BlissyOnX)

@iamtrask so who is the rogue 2020s IBM splitting off to build the minicomputer this time

1d