Tech · 5h ago

Researcher Flags Rise of Self-Made AI Benchmarks in Training and Evals

Original post
rohan anil@_arohan_ · #86 in Tech

One shall not make a benchmark with their own rules and compete in the same benchmark is a reasonable rule of thumb.

But this is common now across many topics, from training, inference, evals.

4:12 PM · Jun 10, 2026 · 4K Views
Sentiment

One reply dismissed the proposed rule of thumb as overly ambitious, since researchers now commonly tune their own metrics.

Pos: 0.0% · Neg: 100.0%
1 comment with sentiment.
Cluster Engagement
Posts from X
Most Activity
Views 1.9K · Bookmarks 1 · Likes 4
rohan anil@_arohan_

I guess it's a form of circular economy

5h · Views 1.9K · Likes 4 · Bookmarks 1
REPLIES 1
rohan anil@_arohan_

@init_malachi Because you are competing with yourself

5h · Views 39 · Likes 1
M@init_malachi

@_arohan_ in real life you kinda have to tho

5h · Views 45
M@init_malachi

@_arohan_ never known different tbh

5h · Views 15
Ω.KendrickPlumard@fouriergalois

@_arohan_ let’s get back to the vague posting though. when will we see the loss curve for the 4th order optimizer

5h · Views 12
Invincible@InvincibleEdge

@_arohan_ the difference between a benchmark and a flex is fading fast

5h
Rugbist@rugbist_

@_arohan_ bit ambitious when everyone just tunes their own tape now

5h
Blissy@BlissyOnX

@_arohan_ not wrong but the line between "tuning the rules" and "just knowing your own setup well" gets real thin

5h