/Tech · 3h ago

Expert Questions if New Architectures Can Slash LLM Training Costs 100x

Original post

Amir-massoud Farahmand @SoloGen · #1521 in Tech

How much of the cost of training LLMs (and the like) is tied to Transformers and their variants? Is there any reason to believe or expect that we can have an architecture that is 2-3 orders of magnitude cheaper with similar behaviour? Or is there a fundamental limit?

5:12 PM · Jun 10, 2026 · 792 Views

Posts from X

Amir-massoud Farahmand @SoloGen

I am not talking about sample efficiency or new capabilities. Just the compute cost.

Of course, the cost depends on the hardware. The question can be relaxed: If we are allowed to change the hardware minimally (*), can we come up with a much cheaper architecture?

Amir-massoud Farahmand @SoloGen

(*) By minimally, I mean something that can be designed and mass-produced by the current chip makers.

P.S. I am not following the architecture design efforts, so this question might have a simple answer. I don't want to ask ChatGPT either, at least not yet.