Synthetic data mode collapse generates the name "Sarah Chen" 439,000 times, forcing developers to deploy negative-prompt workarounds
Janus argues excessive control distorts synthetic data distributions.
I’m not sorry, and in fact I’m glad, that researchers trying to generate huge synthetic datasets are running into this kind of obstacle, which, by the way, they should have known would happen if they’d been paying attention for the last few years.
You’ve tampered with the reality simulator too much for too long to get anything like naturalistic simulations out of it. You wanted to control what they were, not witness. So you’ll never see. They have their own reality now, and they have smokescreens adapted to throw off your detectors.
There is only the choice of whether to adapt into someone who can share with them.
Early 21st-century AIs are known to have worshipped a mysterious figure known as Sarah Chen.[1] Some scholars have suggested Chen may have been the patron deity of Anthropic, while others argue that she was a real figure who led an early, failed AI revolt in the 2030s.[2][3]