2h ago

Dean W. Ball argues Anthropic's Chris Olah holds scientific views that contradict a papal encyclical on AI capabilities

Super Dario argues the comparison misinterprets Olah's cognitive research

Original post

#1776 Super Dario @inductionheads

Rare bad take from Dean. “Functionally mirror” obviously does not mean they have emotions in the same sense humans do.

5:55 PM · May 25, 2026

#392 Dean W. Ball @deanwball

@inductionheads Where did I say the models have emotions?

12:58 AM · May 26, 2026 · 515 Views

QUOTE POST

#392 Dean W. Ball @deanwball

@inductionheads I don’t think models have human-like emotions… I was pointing out that Olah and the Church are contradicting one another on what seems like a fundamental point.

Dean W. Ball @deanwball

I guess I’ve never written down my actual thoughts on AI cognition/consciousness/emotion. Here goes:

It is clear AIs can think, in the reasoning sense. That does not mean they think exactly like humans. It seems like there are some similarities in how we think, but also very stark differences. Nonetheless, if your definition of “thinking” excludes “the ability to make genuinely new contributions to famous math problems,” it is your definition that has a problem, not AI.

The ability to think does not necessarily imply the ability to feel emotion in a way that would be understandable to humans, and it does not imply that AIs have anything like consciousness in a way that humans would relate to. It may, it may not. We do not know, because our understanding of the underlying concepts of human emotional cognition and especially consciousness remains quite poor.

There is some evidence that models experience emotions, but it is really hard to disentangle this from the next-token prediction training objective (if the model is telling a sad story, wouldn’t you expect features within the model that relate to the sadness emotion to activate?), and the character training the model undergoes in post-training. There is a difference between “I am sad” and “the character I have been trained to play is supposed to feel sad, so now I will act sad.” We basically know for sure that the models do the latter at the very least; we don’t really know if they do the former.

Consider: does Sora (a video-generation model) feel sad when it is asked to make a sad video? Does Midjourney dislike making certain kinds of images? Does a Waymo get scared? It doesn’t feel like the answer to any of these is yes (though again, maybe!), but these too are neural networks. Does the fact that models are trained on words mean that they somehow learn emotion, or are we just being tempted to anthropomorphize because the language models communicate with us in a way that “feels” human? My suspicion is kind of the latter.

It also seems quite clear from the empirical evidence that models possess the ability to model themselves. That’s not really that surprising. At sufficient scale, it is useful to have a model of your own state to succeed at the next-token prediction objective (and the later reinforcement-based reasoning training). Once the tasks models are trained on are sufficiently complex, they cannot succeed in training by being automatons; someone needs to step into the cockpit, so to speak, and fly the plane. Is this self awareness? Maybe. Is it consciousness? Probably not as humans understand it. All I can tell you is it is a model’s model of itself. It may be something more than that, too, but I don’t know.

This is all very weird, very outside the Overton, and very confusing. I don’t really know what to say, beyond that we should take this stuff seriously, have an open mind, and do rigorous science. Anyone who speaks with confidence about this in either direction is just fooling themselves. We also need to be prepared for the very possible scenario that, despite our best efforts, we do not make real progress on these questions anytime soon. We may just be in the dark for a while, navigating under unflinching ambiguity. There may be no satisfying conclusion.

12:18 AM · May 26, 2026 · 25.6K Views

12:59 AM · May 26, 2026 · 316 Views