11h ago

Epoch AI finds open-weight models lag proprietary state-of-the-art AI systems by about four months

The performance gap has remained steady since January 2023.

Original post

We took another look at the capability gap between open-weight and proprietary models. Since the start of the year, open-weight models have lagged the state of the art by four months.

1:01 PM · May 29, 2026

I am not sure I have seen a good analysis of how much distillation reduces this gap - people have very different views on this, but they are rarely justified quantitatively (unless I missed something)

Not a comment on Epoch's thing, just a general one

Quoting Epoch AI (@EpochAIResearch) · 8:01 PM · May 29, 2026 · 68.4K Views
12:39 AM · May 30, 2026 · 2.9K Views

is having a four month lead a sustainable multitrillion dollar business model?

3:57 AM · May 30, 2026 · 5.5K Views

The open model story is largely Chinese now, but the point of closest convergence on this graph is around L3-405B. And honestly, I think that's wrong. No, it was not on par with Sonnet 3.5. You could do *things* with 3.5 that open models took maybe 6 more months to reach.

11:05 PM · May 29, 2026 · 6.2K Views

actually I'm wrong, it's Qwen 2.5-72B (depending on how you count closeness). The same objection applies though. It was not remotely Sonnet-tier. V3 wasn't Sonnet-tier either. R1 was a whole different beast. Maybe only V3-0324 was the first to really qualify.

Quoting Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) (@teortaxesTex) · 11:05 PM · May 29, 2026 · 6.2K Views
11:08 PM · May 29, 2026 · 1.6K Views

I wonder how far behind they would be if they didn't distill / benefit from the proprietary models

10:05 PM · May 29, 2026 · 5.2K Views

According to research by EpochAI, open-weight models lag behind frontier closed-source models by four months.

Four months. That's very little. And impressive at the same time.

8:19 PM · May 29, 2026 · 16.1K Views
Epoch AI finds open-weight models lag proprietary state-of-the-art AI systems by about four months · Digg