General Reasoning co-founder Ross Taylor argues sovereign AI initiatives do not need multi-billion-dollar megarounds on day one to be viable

Yes, so to scale you need to get to Ant/OpenAI numbers - I don’t think I disagree with you there. My point on (6) is about what is needed on day 1, and arguing against the idea that new sovereign efforts need megarounds or it’s not even worth it.

Right now this narrative is used to dismiss the possibility of starting new efforts out of hand - and people are already using these types of argument to argue Europe shouldn’t even play the game.

My more detailed response to this is:

1. The successful playbooks have actually been narrow / jagged to begin with. Coding has been the classic example in the past few years, both as a place to start training specialised models (eg DeepSeek famously started here) and as a place to build product surfaces. Arguably a better approach than that of some of the early neolabs, who went general to begin with.

3. Related to 2 - I think the first step on the road to sovereignty is to carve out specialisms where capability can be built and models can be useful / comparable to the frontier immediately. Then expand from the beachhead capabilities into broader domains. I think it’s a fairly uncontroversial point that with the best open base models today - and good post-training - you can get close to, or better than, general models in a chosen domain (ofc you pay a price in less generality + likely worse generalisation to unseen things at test time). A lot of the edge here is data driven - which is expensive too! (RL env budgets) - but it allows you to find wedges without immediately being shellacked by a frontier model.

4. I think megarounds (when done wrong) are actually highly counterproductive in that they raise expectations for a first release - but the first release is almost certainly not going to be competitive - and make it harder to raise subsequent rounds. They also scale bureaucracy before capability, and create political shitshows at the most critical phase of company building. I think spreading bets between different teams then aggressively scaling the winners is probably a better strategy for sovereignty.

5. On new directions specifically, I don’t actually think a lot of new approaches historically were that expensive to validate initially at lower compute budgets (eg RL for LLMs) - and I think there are a lot of directions to go in for LLMs now that can be validated cheaply and then aggressively scaled up if they work. I think this is important as competing directly leads to an Achilles/tortoise sense of being “always behind” - but I think it’s possible to leapfrog if you focus on the right thing.

6. When comparing total funding / people, my guess is that these figures are biased upward relative to current needs, because a player with a resource advantage will lean into brute force, paying for a performance level to compensate for inefficiency / internal political dysfunction. Chinese LLMs demonstrate how to do things more efficiently.

I agree with your core points on scale though! I just have strong views on how things should be sequenced and I’m very sceptical of new top-down neolab efforts in general that don’t scale the right way.

Nando de Freitas@NandoDF

Most of these points are correct, especially point 5. Point 6 is however incorrect. I’ve trained at frontier scale recently and know that you need $1B (like Mistral or Cohere) to be even 1 or 2 years behind.

This is the real competition:

Money raised: OpenAI has raised more in total, roughly $180B+ versus Anthropic’s roughly $130B+, but Anthropic’s latest private valuation, $965B, is higher than OpenAI’s last official private valuation of $852B.

Compute: OpenAI has the larger announced long-term compute program, especially through Stargate and Nvidia/AMD/Broadcom deals. Anthropic has unusually concrete disclosed near-term capacity, including >1M Trainium2 chips and the SpaceX 220k+ GPU agreement. For both, the exact live GPU/FLOP inventory is not public.

People: OpenAI is probably around 5k-ish today if you start from the March report of 4,500 and its year-end 8,000 target, but figures from data providers vary. Anthropic is probably around 3k-ish, with credible public figures from 3,200+ to 3,700+.

Matt Clifford@matthewclifford

@rosstaylor90 @NandoDF Btw, I think you guys have way more in common than difference on this and would enjoy meeting and discussing further…

Nando de Freitas@NandoDF

@rosstaylor90 I’m 💯 with you that we need clever sequencing.

Nando de Freitas@NandoDF

@matthewclifford @rosstaylor90 💯

Story Overview

Validation comes cheaper than full scaling

Talent already exists to test new directions