Tech · 1d ago

Florian Brand of Prime Intellect argues AI models should receive a score of zero when they refuse a benchmark task

This would eliminate fallback routing used in GPQA and MMLU.
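To make the proposed rule concrete, here is a minimal sketch of the arithmetic. It compares counting a refusal as 0 against one possible alternative, dropping refused tasks from the denominator. The `Result` type, both function names, and the sample data are hypothetical and for illustration only; this is not how GPQA, MMLU, or any particular evaluation harness actually handles refusals.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Result:
    # True = correct, False = incorrect, None = the model refused (assumed encoding)
    correct: Optional[bool]


def accuracy_refusal_as_zero(results: List[Result]) -> float:
    """Every task stays in the denominator; a refusal earns 0 for that task."""
    if not results:
        return 0.0
    return sum(1 for r in results if r.correct is True) / len(results)


def accuracy_skipping_refusals(results: List[Result]) -> float:
    """Assumed alternative: refused tasks are dropped before averaging."""
    answered = [r for r in results if r.correct is not None]
    if not answered:
        return 0.0
    return sum(1 for r in answered if r.correct) / len(answered)


# Made-up sample: two correct answers, one wrong answer, one refusal.
results = [Result(True), Result(True), Result(False), Result(None)]
print(accuracy_refusal_as_zero(results))    # 2 correct out of 4 tasks
print(accuracy_skipping_refusals(results))  # 2 correct out of 3 answered tasks
```

Under the zero-score rule a refusal can only lower or hold the score, whereas under the skipping variant a model that refuses its hardest tasks would see its reported accuracy rise.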

Original post
Florian Brand (@xeophon) · #1190 in Tech

if a model refuses, it should score as 0 on that task

1:57 PM · Jun 9, 2026 · 218.3K Views
Sentiment

Users are reacting to a critic urging zero scores for AI model refusals in benchmarks, with some agreeing it makes sense while others sarcastically dismiss the evaluations as worthless or mock their design.

Positive: 33.3%
Negative: 66.7%
3 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Views 9.9K · Bookmarks 1 · Likes 284 · Replies 3

Also, refusals on MMLU?? What are they even doing over there

if a model refuses, it should score as 0 on that task

1d · Views 9.9K · Likes 284 · Bookmarks 1
Conor (@jconorgrogan)

@xeophon wtf are these evals.

Someone should just fine tune gemma to just refuse to do anything and route to Best-of-n across all models

1d · Views 69 · Likes 2
Uday Bhaskar (@BhaskarSteve)

@xeophon Might as well report 0 on all new AI benchmarks, saves cost

1d · Views 25