Alex Turner rejects "hyperstition" as a name for self-fulfilling misalignment
Alex Turner, a research scientist on the scalable alignment team at Google DeepMind, posted that "hyperstition" is a poor name for self-reinforcing model behaviors that produce misalignment. He proposed "self-fulfilling misalignment" as the clearer alternative because the term explains the concept directly. Aryaman Arora replied that "transformer" similarly fails to describe the underlying neural network architecture.
@Turn_Trout "Transformer" isn't a good name...
"Hyperstition" isn't a good name (but it's cool and sounds mysterious). "Self-fulfilling misalignment" is less cool but better overall because it's self-explanatory. We should use the self-explanatory name.
@Turn_Trout self-fulfilling alignment can also be a hyperstition though
"Hyperstition" isn't a good name (but it's cool and sounds mysterious). "Self-fulfilling misalignment" is less cool but better overall because it's self-explanatory. We should use the self-explanatory name.
"Hyperstition" isn't a good name (but it's cool and sounds mysterious). "Self-fulfilling misalignment" is less cool but better overall because it's self-explanatory. We should use the self-explanatory name.
(Similarly, "shard theory" is cool but not a good name. Oops.)
"Hyperstition" isn't a good name (but it's cool and sounds mysterious). "Self-fulfilling misalignment" is less cool but better overall because it's self-explanatory. We should use the self-explanatory name.