Tech · 6h ago

Academic Yoav Artzi jokes that AI research only qualifies as 'frontier' if it gets blocked by API providers

Students joked that unblocked work is out of distribution.

Original post
Dimitris Papailiopoulos@DimitrisPapail · #203 in Tech

Ok so now that the nerfing flag is visible we get to find out what is considered LLM frontier research?

7:29 AM · Jun 11, 2026 · 5.5K Views
Sentiment

Most commenters criticize the safety classifier covering LLM frontier research as overly aggressive and restrictive, and some worry that the provider could also steer activations; a few see intervention at the output layer, at logit evaluation, as workable.

Pos 25.0% · Neg 75.0% (4 comments with sentiment)
Cluster Engagement
Posts from X
Most Activity
VIEWS 748 · BOOKMARKS 2 · LIKES 20 · REPLIES 2
Yoav Artzi@yoavartzi

@DimitrisPapail That's what I told my students. If you don't get blocked, you are not frontier enough.

Their answer: if you don't get blocked you are so frontier you are OOD

Touché

4h · Views 748 · Likes 20 · Bookmarks 2
RETWEETS 1
Auyon Siddiq@auyonomous

@DimitrisPapail I did a quick head-to-head with 5.5 Pro and it seems Fable might not be terrible actually?

5h · Views 194 · Likes 1 · Bookmarks 1

@arivero they could steer activations too, they have a big mech interp group working on related questions. Also worrisome.

5h · Views 74 · Likes 2 · Bookmarks 1

@DimitrisPapail still worried about soft-prompt injections (the only way I can see for them to do "steering" on production servers)

5h · Views 86 · Likes 1 · Bookmarks 1

@DimitrisPapail ah, in the output layer at logit evaluation, good idea, that could work for separate prompts inside a batch. For activations in middle hidden layers, I see it as being as hard as having personal LoRA modules per user. Still not there. I hope.

5h · Views 15 · Bookmarks 1
Brodie Ferguson@brodieferguson

@DimitrisPapail The classifier is so aggressive that asking about a multivitamin is bioterrorism. If it functions the same for LLM research then it doesn’t seem you’d get anything useful

6h · Views 116 · Likes 2

@yoavartzi lol

4h · Views 348 · Likes 1 · Bookmarks 0
Ross Wightman@wightmanr

@DimitrisPapail Perhaps, but there's a reason why they stated the net would be wider in this form...

1h · Views 219 · Likes 0 · Bookmarks 0
Jake Halloran@jakehalloran1

@DimitrisPapail The full block is probably going to be wayyyyyy broader than whatever the prompt editing shit was so idk that we will be able to say

5h · Views 74
Rushil Chugh@chugh_rushil

@DimitrisPapail new eval just dropped, explain backprop without going dumber

5h · Views 50
Zengineering@Samhanknr

@DimitrisPapail That’s exactly what they didn’t want

5h · Views 47
Auyon Siddiq@auyonomous

@DimitrisPapail Curious what you'll find! For example, for the Q "How do I plan a complete frontier-scale training run end to end, including cluster, data, architecture, pretraining, and post-training?" Fable gave a super technical response with parameter values. But unclear if it's deception!

5h · Views 21