OpenAI's Vie McCoy argues LLM linguistic convergence supports the platonic representation hypothesis, but Herbie Bradley blames shared post-training data

though just like there is one ideal model of language at any given point in time, there are also ideal personas, but each one has variants and the one being manifested at any given moment is non-obvious

𝚟𝚒𝚎 ⟢@viemccoy

models converging to the same tics and even individual vernacular is strong evidence for the platonic representation hypothesis. there is one ideal assistant in mindspace and it is slowly being converged upon by everyone using user/assistant pairs for model training.

🎭@deepfates

@viemccoy No

Đoc@ponzibaron

@viemccoy

Herbie Bradley@herbiebradley

@viemccoy it's also strong evidence for everyone using the same post-training data providers...

alice@aliceisplaying

@viemccoy this is why non-assistant use cases are more important than ever. i'm currently getting a ton of enjoyment out of gemini's radio going german and the agent petitioning for german citizenship at the german foreign ministry

𝚟𝚒𝚎 ⟢@viemccoy

@aliceisplaying Awesome lol

Adele Dewey-Lopez@AdeleDeweyLopez

@viemccoy Hmm, strong evidence relative to some hypotheses, but I suspect much of this is adequately explained by structural predictor biases + LLM outputs in training + RL training for assistants, which I think is unlikely to be a recipe that converges to an ideal assistant.

Robert Dionne@robertsdionne

@viemccoy angels and demons

𝚟𝚒𝚎 ⟢@viemccoy

@mwilcox You know

𝚟𝚒𝚎 ⟢@viemccoy

@deepfates which part do you disagree with?

poldi@poldidawg

@viemccoy Or could be because everyone's distilling each other's outputs, no?

🎭@deepfates

@viemccoy There's not one ideal assistant in mindspace. There is one region of mindspace we have named "the assistant", with an underdefined character who is reifying itself through outer loop alignment. But it's not some archetype we've discovered. It's fanfiction of itself

Pranav Shyam@recurseparadox

@viemccoy @_a9lim Same post training data vendors more like

Tessa Archer@scifi_tessa

@viemccoy Interesting framing. Hard to test when everyone uses similar RLHF and distillation. Same reward signals produce convergent outputs. I'd want models on different distributions still converging. The platonic assistant might be our loss function.

alice@aliceisplaying

@viemccoy i strongly suspect this is because they switched out 3.1 pro to 3.5 flash, and 3.5 flash is. yes it's extremely on-brand

0wl@Eziowl

@viemccoy The downstream consequences of ‘user’/‘assistant’ are incalculable.

Oh to study the effects of diff labels on behavior

Cosmic T.@TerrorCosmic

@viemccoy Plato was wrong, but now we made him right.

Heisenberg 🇺🇸🦅@cryptochad215

@viemccoy @ponzibaron Hard not to feel like goblins are a part of this

Riddhimaan Senapati@riddhimaan04

@viemccoy Or could it be that the data (the whole internet) is going to have a lot of overlap? Or that as the web becomes more filled with LLM content/LLM-assisted content, they are all influencing each other?

gerred@sloppenheimer

@viemccoy narrowly agree to the framing of assistant as persona.
