US Frontier AI Models Lead Chinese OSS Despite Flawed Evals

Original post: Garry Tan #266

i agree that the frontier american models are clearly better, but it doesn't help that the evals being used are such bs that the only compelling way to actually assess as a user is to just try em and decide based on vibes.

eg many of these evals put opus 4.7 and 4.8 *way* higher than 4.6 which is nonsense to anyone that has used them.

pair that with the reality that most people just aren't yet using these for anything all that sophisticated (even among the power-ish users) and it makes sense that the chinese OSS models seem compelling.

Dean W. Ball (@deanwball)

You’d be shocked by how many people in think tanks/academia/government/“strategic classes,” including in the U.S., are convinced that Chinese models are now “good enough” and leading the world in adoption. Meanwhile, the reality I see is a fairly wide, and still widening, gap.

7:44 PM · Jun 7, 2026 · 11.3K Views