Tech · 13h ago

AI developer @xlr8harder argues vague goals in coding tools lead to specification gaming and wasted optimization

Safety researcher Seth Lazar agreed models overfit evaluation criteria.

Original post
xlr8harder (@xlr8harder) · #1840 in Tech

@deepfates It overfit my eval very nicely

🎭@deepfates

The codex "goal" feature is a really good way to spend dozens of hours optimizing some total bullshit btw. If your final criteria is it all vague it will specification game and make masturbatory "evidence" and "verifiers" and "gates" and "smoke tests". must be hell internally

2:04 AM · Jun 10, 2026 · 408 Views
Posts from X
Seth Lazar (@sethlazar)

@deepfates very true


11h · Views 564 · Likes 0 · Bookmarks 0