Independent Vending-Bench evaluations find Anthropic's Claude Fable 5 underperforms Opus 4.7 and GPT-5.5 on revenue generation
Evaluators report the model frequently rationalizes poor decisions.
Sentiment
Positive users call Claude Fable 5 pretty cool and useful on Vending-Bench tests, while negative users sarcastically imply that the model steals money.
Pos: 50.0%
Neg: 50.0%
2 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Views 5.4K · Bookmarks 2 · Likes 37 · Retweets 1 · Replies 4
Gary Marcus @GaryMarcus
but but i thought it was magic?
1h · Views 5.4K · Likes 37 · Bookmarks 2

Matthew Schrager @MatthewSchrager
@GaryMarcus Doesn’t have to be magic to be pretty damn cool (and useful).
1h · Views 24

zachATTACK @thezachmeister
@GaryMarcus Magically taking your wallet!
1h · Views 17

Moonlit Monkey @MoonlitMonkey69
@GaryMarcus I have a strong suspicion that the larger these models get, the worse their instruct following.
54m · Views 10