/AI · 1h ago

Florian Brand of Prime Intellect argues AI models should receive a score of zero when they refuse a benchmark task

This would eliminate fallback routing used in GPQA and MMLU.

Original post

Florian Brand @xeophon · #1117 in AI

if a model refuses, it should score as 0 on that task

1:57 PM · Jun 9, 2026 · 7.6K Views
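
To make the proposal concrete, here is a minimal Python sketch of the difference between scoring a refusal as 0 and leaving refusals out of the score. Everything in it is an assumption for illustration: the `Result` type, the convention that a refusal is recorded as `answer=None`, and both function names are hypothetical. It does not describe how GPQA, MMLU, or any particular leaderboard actually handles refusals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    """One benchmark item; answer is None when the model refused (hypothetical convention)."""
    answer: Optional[str]
    expected: str


def accuracy_refusal_as_zero(results: list[Result]) -> float:
    """Refusals stay in the denominator and count as incorrect."""
    if not results:
        return 0.0
    correct = sum(1 for r in results if r.answer == r.expected)
    return correct / len(results)


def accuracy_excluding_refusals(results: list[Result]) -> float:
    """Refusals are dropped before scoring, so only answered items count."""
    answered = [r for r in results if r.answer is not None]
    if not answered:
        return 0.0
    correct = sum(1 for r in answered if r.answer == r.expected)
    return correct / len(answered)


# Four items: one correct, one wrong, two refused.
results = [
    Result("A", "A"),
    Result("B", "C"),
    Result(None, "D"),
    Result(None, "B"),
]

print(accuracy_refusal_as_zero(results))     # 0.25
print(accuracy_excluding_refusals(results))  # 0.5
```

With the same four responses, dropping refusals doubles the reported accuracy, which is the kind of gap the post is objecting to.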

Sentiment

Users are reacting to Florian Brand's call to score AI model refusals as zero on benchmarks. Some agree that it makes sense, while others sarcastically dismiss the evaluations as worthless or mock their design.

Positive: 33.3%

Negative: 66.7%

3 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views: 690 · Likes: 35 · Replies: 1

Florian Brand @xeophon

Also, refusals on MMLU?? What are they even doing over there

1h690350

Retweets: 1

Tanishq Mathew Abraham, Ph.D. @iScienceLuvr

@xeophon 100%

1h24460

Conor @jconorgrogan

@xeophon wtf are these evals.

Someone should just fine-tune Gemma to refuse to do anything and route to best-of-n across all models

1h692

Uday Bhaskar @BhaskarSteve

@xeophon Might as well report 0 on all new AI benchmarks, saves cost

1h25

Florian Brand @xeophon

@jconorgrogan tempting…

1h12