
A study tracing persona vectors in large language models finds that post-training amplifies existing pretraining representations rather than creating new ones, with vectors emerging after 0.22% of tokens in OLMo-3 and Apertus

Assistant-like personas form early in pretraining and persist across checkpoints.

Original post

Viktor looked at how the persona vectors evolve across pretraining and post-training. One can find the vectors already very early in pretraining. A finding that motivates our recent Synthetic Persona Pretraining blogpost very well: those representations are shaped early.

9:21 AM · May 22, 2026

@jiaxinwen22 Alignment starts on the first backwards pass

Jiaxin Wen @jiaxinwen22

imo this suggests how shallow persona vectors are

4:57 PM · May 22, 2026 · 6.2K Views
5:03 PM · May 22, 2026 · 181 Views