1d ago

A new blog post introduces Synthetic Persona Pretraining (SPP), which embeds desired values directly into pretraining data, and reports a 1.7 percent mean attack success rate on 1.7B models.

SPP Token Zero beat unfiltered, filtered, and SafeLM baselines across five benchmarks.

Original post

New blog! Synthetic Persona Pretraining (SPP): Alignment from Token Zero
Current alignment is shallow - values bolted on after pretraining can be routed around. To solve this, we wrote the desired persona directly into pretraining data.
Early results, but we're very excited. 🧵

8:02 AM · May 20, 2026
Reposted by

This is super cool work. Great to see open research on this topic!

Julian Minder @jkminder

New blog! Synthetic Persona Pretraining (SPP): Alignment from Token Zero
Current alignment is shallow - values bolted on after pretraining can be routed around. To solve this, we wrote the desired persona directly into pretraining data.
Early results, but we're very excited. 🧵

3:02 PM · May 20, 2026 · 25.2K Views
3:40 PM · May 20, 2026 · 2.3K Views

Check out Julian and co's interesting blogpost on how to use synthetic personas during pretraining, for improved safety alignment:

3:40 PM · May 20, 2026 · 1.9K Views

Persona research is an entire field unto itself! If you want diverse Persona options, you need to consider them from the very start of your stack! Lots to explore here.

(If you are working on this sort of thing independently, I'm very interested in hiring you!)

6:25 AM · May 21, 2026 · 10.4K Views