Claude Opus-4.8 Hits 42/99 on Prinzbench Using New Max Setting

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)#420

prinz (@deredleritt3r)

Added to prinzbench: Opus-4.8.

For the very first time, the Max setting was available to me in the Claude app when I used this model. Using this setting, Claude's performance improved dramatically vs. all prior Anthropic models. Opus-4.8 (Max) scored 42/99 on prinzbench, as compared to 25/99 for Opus 4.7 (Extended).

This was the second-highest score of all tested models to date for a model: (i) not released by OpenAI, and (ii) not utilizing a multi-agent setup or parallelized compute. (Gemini 3.1 Pro is still the best such model, having scored 50/99.)

I am now very curious about how the "Mythos-class models" that Anthropic has promised to release in the near future will perform on my benchmark.

9:18 PM · Jun 2, 2026 · 6.1K Views

Sentiment

Commenters criticized Claude Opus-4.8 as frustrating and weak on practical legal-document and search tasks compared to other frontier models.

Pos: 0.0% · Neg: 100.0% · 2 comments with sentiment.
