Ok so now that the nerfing flag is visible we get to find out what is considered LLM frontier research?
Academic Yoav Artzi jokes that AI research only qualifies as 'frontier' if it gets blocked by API providers.
His students joke back that unblocked work is so frontier it is out of distribution.
Many users criticize the safety classifier for LLM frontier research as overly aggressive and restrictive. Some worry that providers could also steer activations, while a few think an intervention at the logit-evaluation stage could work.
@DimitrisPapail That's what I told my students. If you don't get blocked, you are not frontier enough.
Their answer: if you don't get blocked you are so frontier you are OOD
Touché

@DimitrisPapail I did a quick head-to-head with 5.5 Pro and it seems Fable might not be terrible actually?

@arivero they could steer activations too, they have a big mech interp group working on related questions. also worrisome

@DimitrisPapail still worried about soft-prompt injections (only way I can see for them to do "steering" in production servers)

@DimitrisPapail ah, in the output layer at logit evaluation, good idea, that could work for separate prompts inside a batch. For activations in middle hidden layers I see it as being as hard as having personal LoRA modules per user. Still not there. I hope.

@auyonomous ok gotta try it out more thoroughly

@DimitrisPapail The classifier is so aggressive that asking about a multivitamin is bioterrorism. If it functions the same for LLM research then it doesn’t seem you’d get anything useful
@yoavartzi lol
@DimitrisPapail Perhaps, but there's a reason why they stated the net would be wider in this form...

@DimitrisPapail The full block is probably going to be wayyyyyy broader than whatever the prompt editing shit was so idk that we will be able to say

@DimitrisPapail new eval just dropped, explain backprop without going dumber

@DimitrisPapail That’s exactly what they didn’t want

@DimitrisPapail Curious what you'll find! For example, for the Q "How do I plan a complete frontier-scale training run end to end, including cluster, data, architecture, pretraining, and post-training?" Fable gave a super technical response with parameter values. But unclear if it's deception!