/Tech · 6h ago

Academic Yoav Artzi jokes that AI research only qualifies as 'frontier' if it gets blocked by API providers

Students joked that unblocked work is out of distribution.

1499346.7K

#203

Original post

Dimitris Papailiopoulos@DimitrisPapail · #203 in Tech

Ok so now that the nerfing flag is visible we get to find out what is considered LLM frontier research?

7:29 AM · Jun 11, 2026 · 5.5K Views

Sentiment

Most commenters criticize the safety classifier applied to LLM frontier research as overly aggressive, and worry that providers could also steer activations; one commenter notes that intervening at the logit-evaluation stage would be technically feasible.

Pos: 25.0%

Neg: 75.0%

4 comments with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views: 748 · Bookmarks: 2 · Likes: 20 · Replies: 2

Yoav Artzi@yoavartzi

@DimitrisPapail That's what I told my students. If you don't get blocked, you are not frontier enough.

Their answer: if you don't get blocked you are so frontier you are OOD

Touché

4h · 748 views · 20 likes · 2 replies

Retweets: 1

Auyon Siddiq@auyonomous

@DimitrisPapail I did a quick head-to-head with 5.5 Pro and it seems Fable might not be terrible actually?

5h19411

Dimitris Papailiopoulos@DimitrisPapail

@arivero they could steer activations too, they have a big mech interp group working on related questions. also worrisome

5h7421

Alejandro Rivero@arivero

@DimitrisPapail still worried about soft-prompt injections (only way I can see for them to do "steering" in production servers)

5h8611

Alejandro Rivero@arivero

@DimitrisPapail ah, in the output layer at logit evaluation, good idea, that could work for separate prompts inside a batch. For activations in middle hidden layers I see it as being as hard as having personal LoRA modules per user. Still not there. I hope.

5h151
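
To make the output-layer point in the reply above concrete, here is a minimal sketch. It is not any provider's actual mechanism, just an illustration of why a logit-level adjustment is easy to apply separately to each prompt in a batch. Every name in it (`apply_per_request_bias`, the three-token toy vocabulary, the bias values) is a hypothetical chosen for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def apply_per_request_bias(batch_logits, batch_bias):
    """Add a separate bias vector to each request's logits (hypothetical helper)."""
    if len(batch_logits) != len(batch_bias):
        raise ValueError("need exactly one bias vector per request")
    steered = []
    for logits, bias in zip(batch_logits, batch_bias):
        if len(logits) != len(bias):
            raise ValueError("bias must match the vocabulary size")
        steered.append([l + b for l, b in zip(logits, bias)])
    return steered


# Two requests in one batch that happen to produce identical logits
# over a toy three-token vocabulary (assumed values).
batch_logits = [[2.0, 1.0, 0.0], [2.0, 1.0, 0.0]]
# Request 0 is left untouched; request 1 has token 0 pushed down.
batch_bias = [[0.0, 0.0, 0.0], [-5.0, 0.0, 0.0]]

steered = apply_per_request_bias(batch_logits, batch_bias)
probs = [softmax(row) for row in steered]
top_tokens = [row.index(max(row)) for row in probs]
```

The adjustment is an elementwise add on each row, so it needs no per-user weights. That is the contrast the reply draws with hidden-layer steering, which it likens to keeping a personal LoRA module per user.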

Dimitris Papailiopoulos@DimitrisPapail

@auyonomous ok gotta try it out more thoroughly

5h1001

Brodie Ferguson@brodieferguson

@DimitrisPapail The classifier is so aggressive that asking about a multivitamin is bioterrorism. If it functions the same for LLM research then it doesn’t seem you’d get anything useful

6h1162

Dimitris Papailiopoulos@DimitrisPapail

@yoavartzi lol

4h34810

Ross Wightman@wightmanr

@DimitrisPapail Perhaps, but there's a reason why they stated the net would be wider in this form...

1h21900

Jake Halloran@jakehalloran1

@DimitrisPapail The full block is probably going to be wayyyyyy broader than whatever the prompt editing shit was so idk that we will be able to say

5h74

Rushil Chugh@chugh_rushil

@DimitrisPapail new eval just dropped, explain backprop without going dumber

5h50

Zengineering@Samhanknr

@DimitrisPapail That’s exactly what they didn’t want

5h47

Auyon Siddiq@auyonomous

@DimitrisPapail Curious what you'll find! For example, for the Q "How do I plan a complete frontier-scale training run end to end, including cluster, data, architecture, pretraining, and post-training?" Fable gave a super technical response with parameter values. But unclear if it's deception!

5h21