Anthropic launches Claude Opus 4.8, claiming top performance on AttuneBench while trailing Claude 5.5 on coding

VIEWS13.7MBOOKMARKS7.9KLIKES65.6KRETWEETS8.5KREPLIES3.4K

Claude@claudeai

Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.

Available today at the same price.

32d13.7M65.6K7.9K

Boris Cherny@bcherny

Claude Opus 4.8 is out today. It's our strongest coding model yet: up on SWE-bench Pro (from 64.3 to 69.2) and noticeably more honest about its own work. It tells you when it's unsure and catches its own bugs instead of declaring victory early. Same price as 4.7.

Claude@claudeai

Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.

Available today at the same price.

32d311.8K5.5K407

Thariq@trq212

I think you’ll really like Opus 4.8

It’s as smart as its benchmarks show but expresses and utilizes that intelligence in a warm and collaborative way.

Workflows are a great way to utilize it- I’m hooked. Article on that soon.

Claude@claudeai

Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.

Available today at the same price.

32d178.8K2.5K201

Theo - t3.gg@theo

wtf this is real?? they really are leaning into the whole slot machine thing huh

Evan Bacon 🥓@Baconbrix

bro what

32d178.4K2.1K207

Alex Albert@alexalbert__

Excited to release Opus 4.8 today! We heard your feedback on 4.7 and have made many fixes for 4.8.

4.8 understands nuances better, feels much more natural to talk to, and is overall a stronger collaborator on everything from coding to knowledge work.

Claude@claudeai

Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.

Available today at the same price.

32d179.2K2.1K149

swyx @aiDotEngineer WF Day 1@swyx

"Developers can update Claude’s instructions mid-task without breaking the prompt cache or routing the update through a user turn"

wtf? how??

Claude@claudeai

Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.

Available today at the same price.

32d133.6K833325

Elon Musk@elonmusk

@claudeai @farzyness Nice work

32d62.7K2.4K36

Chubby♨️@kimmonismus

Huge!! „Mythos class model to all customers in the coming weeks“!!

Holy, we accelerate!!

Chubby♨️@kimmonismus

Thank god! I can turn off adaptive thinking and set reasoning effort myself.

Finally!

32d59.7K1.1K99

Aaron Levie@levie

Opus 4.8 is out, and we've been testing it with the Box AI agent on our most complex real-world knowledge worker tasks with enterprise documents.

Opus 4.8 is measurably better at the generative and analytical work enterprises care about most like writing reports, synthesizing data, reviewing complex enterprise documents across a range of industries. Here are some quick examples of wins vs. Opus 4.7:

* Report drafting: Opus 4.8 outperforms on a majority of report drafting tasks, producing more complete and accurate analytical reports. On an industrial goods reporting task, it scored 87% vs 77% for Opus 4.7; on a consumer products launch evaluation, 90% vs 84%.

* Review and verification: On a legal NDA review task requiring verification of contract terms against compliance criteria, Opus 4.8 catches more relevant clauses and flags more potential issues, with near-perfect consistency across all trials.

* Financial data analysis: On a corporate lending analysis task comparing syndicated vs bilateral loan structures, Opus 4.8 extracts more accurate financial metrics from source documents, leading by nearly 8 percentage points.

* Consumer products launch evaluation: On a task requiring assessment of a product launch across multiple performance dimensions, Opus 4.8 captured evaluation criteria that Opus 4.7 missed — producing a more thorough aBnalysis that covered all required factors rather than just the most obvious ones.

* Legal NDA review: On a task verifying NDA terms against compliance criteria, Opus 4.8 identified more relevant clauses and flagged potential issues that Opus 4.7 missed. Its outputs were also highly predictable — producing nearly identical quality across independent runs.

* Public sector grant analysis: When analyzing library grant documentation against eligibility criteria, Opus 4.8 correctly extracted and validated nearly all required data points, catching specific eligibility details that Opus 4.7 overlooked or misinterpreted.

Opus 4.8 will be rolling out shortly to Box customers to deploy in Box AI agents. Learn more here: https://blog.box.com/anthropics-opus-48-advances-enterprise-content-use-cases

Claude@claudeai

Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.

Available today at the same price.

32d89.5K427170

Claude@claudeai

Also new in Claude Code: dynamic workflows (research preview).

For the hardest tasks, Claude makes a plan, runs hundreds of parallel subagents, and verifies its work before reporting back. Think a migration touching hundreds of files.

Read more: http://claude.com/blog/introducing-dynamic-workflows-in-claude-code

32d69.9K687131

Claude@claudeai

Fast mode is available for Opus 4.8. It's the same model at roughly 2.5x the speed, and we've made it three times cheaper than before.

Turn it on with /fast in Claude Code. On the API, contact your account manager to request access or join the waitlist: http://claude.com/fast-mode

32d76.6K1.1K66

Claude@claudeai

In Claude Code, Opus 4.8 makes calls like an experienced engineer without needing constant check-ins.

It stays on track across long-running sessions and follows work through in your repo, so you can hand off a feature or a bug sweep while you focus on what's next.

32d61.8K95689

Boris Cherny@bcherny

4.8 defaults to high effort, which spends about the same tokens as 4.7's default on coding but performs better.

For hard problems and long-running async work, switch to xhigh. We've raised Claude Code rate limits to cover the extra tokens.

Boris Cherny@bcherny

Claude Opus 4.8 is out today. It's our strongest coding model yet: up on SWE-bench Pro (from 64.3 to 69.2) and noticeably more honest about its own work. It tells you when it's unsure and catches its own bugs instead of declaring victory early. Same price as 4.7.

32d35.5K77734

elie@eliebakouch

opus 4.8 benchmarks vs mythos and previous generation

graphwalks (long context), USAMO (math) are the biggest improvements. vending bench score is insanely bad

Claude@claudeai

Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.

Available today at the same price.

32d37.5K28387

cat@_catwu

We just shipped Opus 4.8! It's noticeably more honest, owning what it doesn't know and flagging problems in its own code instead of glossing over them. It's our recommended model for daily use in Claude Code.

ClaudeDevs@ClaudeDevs

Opus 4.8 is live in Claude Code today.

A few things worth knowing: 🧵

32d32.6K55632

Chubby♨️@kimmonismus

Thank god! I can turn off adaptive thinking and set reasoning effort myself.

Finally!

Chubby♨️@kimmonismus

Opus 4.8 is live! Even in Germany!!

32d99.7K59836

Boris Cherny@bcherny

We also shipped dynamic workflows in Claude Code (research preview), for tasks too big for one pass. Make sure to default to auto mode so Claude isn't stopping for permissions.

It's token-intensive, so save it for your biggest jobs: migrations, refactors, perf optimization, batch bug fixes.

Boris Cherny@bcherny

4.8 defaults to high effort, which spends about the same tokens as 4.7's default on coding but performs better.

For hard problems and long-running async work, switch to xhigh. We've raised Claude Code rate limits to cover the extra tokens.

32d36.5K42456

Andrew Curran@AndrewCurran_

Opus 4.8 is live for me right now. Anthropic's release window is now 42 days.

Andrew Curran@AndrewCurran_

Many rumors swirl, it appears we are getting at least one release this morning after all. Usually, there will be an accompanying storm of industry news that arrives ahead of it. If so, I will put it all in one thread rather than spam people.

32d30.5K54333

Chubby♨️@kimmonismus

Opus 4.8 is live. Benchmarks especially significant jump in Agentic coding, but more important:

„Fast mode is available for Opus 4.8. It's the same model at roughly 2.5x the speed, and we've made it three times cheaper than before.“

Claude@claudeai

Introducing Claude Opus 4.8: it builds on Opus 4.7 with sharper judgment, more honesty about its own progress, and the ability to work independently for longer than its predecessors.

Available today at the same price.

32d40.2K51838

Chubby♨️@kimmonismus

„4.8 understands nuances better, feels much more natural to talk to, and is overall a stronger collaborator on everything from coding to knowledge work.“

So big. Is 4.8 being our good old friend 4.6 just better??

Testing time

Alex Albert@alexalbert__

Excited to release Opus 4.8 today! We heard your feedback on 4.7 and have made many fixes for 4.8.

4.8 understands nuances better, feels much more natural to talk to, and is overall a stronger collaborator on everything from coding to knowledge work.

32d33.9K53938