Epoch AI finds open-weight models lag proprietary state-of-the-art AI systems by about four months
The performance gap has remained steady since January 2023.
I am not sure I have seen a good analysis of how much distillation reduces this gap. People have very different views on this, but they are rarely justified quantitatively (unless I missed something).
Not a comment on Epoch's thing, just a general one
We took another look at the capability gap between open-weight and proprietary models. Since the start of the year, open-weight models have lagged the state of the art by four months.
Is having a four-month lead a sustainable multitrillion-dollar business model?
The open model story is largely Chinese now, but the point of closest convergence on this graph is around L3-405B. And honestly, I think that's wrong. No, it was not on par with Sonnet 3.5. You could do *things* with 3.5 that open models took maybe 6 more months to reach.

Actually, I'm wrong: it's Qwen 2.5-72B (depending on how you count closeness). The same objection applies, though. It was not remotely Sonnet-tier. V3 wasn't Sonnet-tier either. R1 was a whole different beast. Maybe V3-0324 was the first to really qualify.

I wonder how far behind they would be if they didn't distill from or otherwise benefit from the proprietary models.
According to research by Epoch AI, open-weight models lag behind frontier closed-source models by four months.
Four months. That's very little. And impressive at the same time.