
Jiaxin Wen positions vintage language models such as Talkie as baselines for testing how pre-training and post-training interact, rather than as tools for rediscovering results like relativity

Alexander Doria replies that synthetic pretraining is a better frame for controlled experiments.

Original post

What's the most valuable thing you can do with vintage LMs like Talkie? I think people are misled by Demis's pitch about rediscovering Relativity. Vintage LMs are just great baselines for LM science, letting you test many hypotheses about how pre-training and post-training interact.

8:36 AM · May 23, 2026

@jiaxinwen22 I like vintage models a lot (likely trained the first one ever) but synthetic pretraining in general is a better frame for controlled experiments.

3:36 PM · May 23, 2026 · 1.6K Views
4:05 PM · May 23, 2026 · 283 Views