
Florian Brand of Prime Intellect argues AI models should receive a score of zero when they refuse a benchmark task

This would eliminate fallback routing used in GPQA and MMLU.
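As a rough illustration of the proposed rule, the sketch below scores a refusal as 0 by keeping it in the denominator. The `accuracy` helper, the use of `None` to mark a refusal, and the contrasting "drop refusals" mode are assumptions made for this example; they are not taken from GPQA, MMLU, or any specific evaluation harness.

```python
from typing import Optional, Sequence


def accuracy(
    answers: Sequence[Optional[str]],
    gold: Sequence[str],
    refusal_as_zero: bool = True,
) -> float:
    """Fraction of correct answers; None marks a refusal (hypothetical convention)."""
    if len(answers) != len(gold):
        raise ValueError("answers and gold must have the same length")

    # A refusal can never count as correct.
    correct = sum(1 for a, g in zip(answers, gold) if a is not None and a == g)

    if refusal_as_zero:
        # Proposed rule: every task counts, so a refusal contributes 0.
        denominator = len(gold)
    else:
        # Contrast case (assumed): refused tasks are dropped from the score.
        denominator = sum(1 for a in answers if a is not None)

    return correct / denominator if denominator else 0.0


# Four tasks, one refusal, two correct answers.
answers = ["A", None, "C", "B"]
gold = ["A", "B", "C", "D"]

strict = accuracy(answers, gold, refusal_as_zero=True)    # 2 correct out of 4 tasks
lenient = accuracy(answers, gold, refusal_as_zero=False)  # 2 correct out of 3 answered
```

Under these assumptions the strict score is 0.5 while the lenient score is about 0.67, which shows how dropping refusals can inflate a reported number.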

Original post
Florian Brand @xeophon · #1117 in AI

if a model refuses, it should score as 0 on that task

1:57 PM · Jun 9, 2026 · 7.6K Views
Sentiment

Users are reacting to the call to score AI model refusals as zero on benchmarks. Some agree that it makes sense, while others sarcastically dismiss the evaluations as worthless or mock their design.

Positive: 33.3%
Negative: 66.7%
3 comments with sentiment.
Cluster Engagement
Posts from X
Most Activity
Views 690 · Likes 35 · Replies 1

Also, refusals on MMLU?? What are they even doing over there

if a model refuses, it should score as 0 on that task

1h · Views 690 · Likes 35 · Bookmarks 0
Conor @jconorgrogan

@xeophon wtf are these evals.

Someone should just fine tune gemma to just refuse to do anything and route to Best-of-n across all models

1h · Views 69 · Likes 2
Uday Bhaskar @BhaskarSteve

@xeophon Might as well report 0 on all new AI benchmarks, saves cost

1h · Views 25