Anthropic reportedly restricts its models from assisting with self-improving machine learning tasks to prevent competitors from building rival frontier LLMs


Oh great - Anthropic assumes Semi Analysis is developing a competing LLM and so it dumbs down their model for them, because Semi Analysis does analysis on cutting-edge GPU research.

Such a weird timeline to be in. Anthropic trying to limit competition limits many others…


Daniel Auras@rasdani_

this is the biggest wake-up call to protect and nourish open source AI

if you don't build out sovereign and independent models+infra closed labs will patronize you to an insulting degree

elie@eliebakouch

mythos will be bad ON PURPOSE on ai "frontier llm research" tasks, this is very very sad for the research community

also the fact that this is on purpose not visible to the user is crazy

elie@eliebakouch

i'm having a really hard time understanding how this can be a good decision

> lying to the user by modifying the weights/prompt sets a very bad precedent and is extremely unaligned

> there is 0 public communication from anthropic about it except a section hidden in a 319 page system card

> it's impossible to know the scope of this safeguard. if you are doing a PR to pytorch does this count? if you are working on kernel development? data collection pipeline for a new eval? this will create a paranoia for every researcher in the field

> you actually don't know how your model is modified, if it's PEFT (modification at the weight level) or steering does this mean your other queries are also biased? is it at the user level or organization level?

there is also the more "moral" argument that the reason why anthropic is able to train this model is ai researchers who will not have access to the model's capabilities anymore. even if you consider that this is the right thing to do, doing it like that is just a lack of respect to the ai research community

in addition to all of that, it's not clear if the safeguard acts on "model autonomy" or "model capabilities" to do ai research. this is very different and my understanding is that it's the latter, and there is almost 0 RESULTS about this in the system card except a vague "2.3.6 Internal measures of AI R&D acceleration" section citing the previous RSI blog so let's look at it:

the only eval targeting research shows a ~5 point improvement between opus 4.8 -> mythos, but opus 4.7 -> opus 4.8 was a 4 point improvement. obviously not the same if the 5 point improvement led to solving significantly harder tasks, but then, let's be transparent about this eval and give more details: difficulty filtering, an example of what a task could look like from a public library?

the other AI R&D capabilities evals in the system card are actually not relevant anymore according to anthropic's own words:

"Claude Mythos 5, like Claude Mythos Preview and Claude Opus 4.7, exceeds top human performance thresholds on all but one of these tasks. The suite therefore no longer provides evidence that the model's capabilities are short of our risk thresholds"

the only one that is not saturated is the "Novel Compiler" one. if you look at the LLM training one (which they consider saturated), it's about how much the model can speedup the training of a small model on a CPU, i don't think anyone would say this is a good proxy for taking a decision to restrict capabilities of the model for ai researcher

idk honestly this feels wrong at so many levels


Gergely Orosz@GergelyOrosz

And they are nerfing Semi Analysis already… it’s not theoretical

I don’t want to pay a premium for a model like this


Ross Taylor@rosstaylor90

Hopefully it is obvious now that if your country’s sovereign AI strategy does not concentrate on the model layer, it is going to have a hard time.

All advanced technology is now downstream of model intelligence.


Lucas Beyer (bl16)@giffmana

looool that's the "hey bigcos, we don't want you to catch up, but please keep paying us shitton" clause.

NomoreID@Hangsiin

When Fable 5 is used for frontier LLM development, it does not notify the user and instead limits the model’s capabilities through methods such as prompt modification, steering vectors, and PEFT.

Anthropic estimated that this would affect approximately 0.03% of traffic.


Ian Osband@IanOsband

tfw Fable is still very happy to help me with Delightful Policy Gradient 💔


POM@peterom

@giffmana You don't understand Lucas, it's for safety


L@llllvvuu

@giffmana still charging for the tokens is kind of diabolical


Lucas Beyer (bl16)@giffmana

@peterom Ah right sorry how could i forget


Alex YGift@Radipdegen

@giffmana "affects 0.03% of traffic" doing a lot of heavy lifting there


elie@eliebakouch

> it's about how much the model can speedup the training of a small model on a CPU, i don't think anyone would say this is a good proxy for taking a decision to restrict capabilities of the model for ai researcher

not saying that anthropic believes that btw, but i'm just highlighting the lack of transparency or good evals when the "llm training eval" is this