Tech · 1d ago

Florian Brand of Prime Intellect argues AI models should receive a score of zero when they refuse a benchmark task

This would eliminate fallback routing used in GPQA and MMLU.

#380

Original post

Florian Brand (@xeophon) · #1190 in Tech

if a model refuses, it should score as 0 on that task

1:57 PM · Jun 9, 2026 · 218.3K Views
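To make the proposal concrete, here is a minimal sketch of how a refusal could be counted as 0 when computing accuracy, next to a variant that drops refusals from the denominator. The `TaskResult` type, the use of `None` to mark a refusal, and both function names are hypothetical and chosen only for illustration; they are not taken from any benchmark's actual harness.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TaskResult:
    task_id: str
    # True/False for an answered task; None marks a refusal
    # (a hypothetical convention used only in this sketch).
    correct: Optional[bool]


def accuracy_refusals_as_zero(results: List[TaskResult]) -> float:
    """Every task counts in the denominator; a refusal earns 0."""
    if not results:
        return 0.0
    return sum(1 for r in results if r.correct is True) / len(results)


def accuracy_excluding_refusals(results: List[TaskResult]) -> float:
    """Refused tasks are dropped before scoring (shown for contrast)."""
    answered = [r for r in results if r.correct is not None]
    if not answered:
        return 0.0
    return sum(1 for r in answered if r.correct) / len(answered)


results = [
    TaskResult("q1", True),
    TaskResult("q2", False),
    TaskResult("q3", None),  # refusal
    TaskResult("q4", True),
]

print(accuracy_refusals_as_zero(results))    # 2 correct out of 4 tasks
print(accuracy_excluding_refusals(results))  # 2 correct out of 3 answered
```

Under these assumptions, a model can never raise its score by refusing when refusals count as 0, whereas dropping refusals from the denominator rewards declining hard questions.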

Sentiment

Users are reacting to Florian Brand's proposal that model refusals be scored as zero in benchmarks. Some agree that it makes sense, while others sarcastically dismiss the evaluations as worthless or mock their design.

Positive: 33.3% · Negative: 66.7% (3 comments with sentiment)

Cluster Engagement

Posts from X

Most Activity

Views: 9.9K · Bookmarks: 1 · Likes: 284 · Replies: 3

Florian Brand (@xeophon)

Also, refusals on MMLU?? What are they even doing over there

Florian Brand (@xeophon)

if a model refuses, it should score as 0 on that task

1d · 9.9K · 284 · 1

Retweets: 1

Tanishq Mathew Abraham, Ph.D. (@iScienceLuvr)

@xeophon 100%

Florian Brand (@xeophon)

if a model refuses, it should score as 0 on that task

1d1.5K110

Conor (@jconorgrogan)

@xeophon wtf are these evals.

Someone should just fine tune gemma to just refuse to do anything and route to Best-of-n across all models

1d692

Uday Bhaskar (@BhaskarSteve)

@xeophon Might as well report 0 on all new AI benchmarks, saves cost

1d25

Florian Brand (@xeophon)

@jconorgrogan tempting…

1d12