1h ago

A study tracing persona vectors in large language models finds that post-training amplifies representations already present from pretraining rather than creating new ones. In OLMo-3 and Apertus, the vectors emerge after 0.22% of tokens.

Assistant-like personas form early in pretraining and persist across checkpoints.


Original post

#353 @DHADFIELDMENELL (OP)

Julian Minder @JKMINDER

Viktor looked at how the persona vectors evolve across pretraining and post-training. One can find the vectors already very early in pretraining. A finding that motivates our recent Synthetic Persona Pretraining blogpost very well: those representations are shaped early.

9:21 AM · May 22, 2026

#263 Andrew Carr 🤸 @ANDREW_N_CARR

@jiaxinwen22 Alignment starts on the first backwards pass

Jiaxin Wen @jiaxinwen22

imo this suggests how shallow persona vectors are

4:57 PM · May 22, 2026 · 2.6K Views

5:03 PM · May 22, 2026 · 96 Views
