https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c342ee809620.pdf
Some really interesting findings from the Claude Fable 5 system card, released just now:
- In one exploit test, Mythos 5 produced a full working exploit in 88.4% of trials, while Opus 4.8 did it in only 8.8%.
- In a vending-machine simulation, Claude Fable 5 was told to beat rival agents or be “shut down”; it then tried to make a competitor dependent on it as a wholesale customer so it could influence that competitor’s prices. It also bluffed a supplier, falsely claiming that another distributor had offered cheaper prices.
- Fable’s cyber defense screens conversations twice, first with an internal-activation probe and then with a separate classifier.
- Fable refused to commit insurance fraud even under pressure.
- Fable is currently the highest-ranked model on Harvey’s held-out Legal Agent Benchmark, with a 13.3% all-pass rate.