Brain Runs Deep Reinforcement Learning Algorithms Parallel to Neural Networks

Original post

OP

#1488Samuel Hammond 🦉@HAMANDCHEESE

The brain not only implements deep reinforcement learning algorithms, but often learns representations (and mechanisms) that directly parallel those learned by deep neural networks. This is why brain-computer interfaces work in the first place, and why even lossy brain data is sufficient to reconstruct what someone is seeing, thinking and even feeling. In fact, silicon BCIs can both read brain states and produce signals that translate to subjective sensations. This is hard to explain if the brain’s substrate is doing fundamentally different kinds of computation, much less if consciousness isn't computable in the first place. (Note that simulating tactile sensation wasn't functionally inert, but helped improve motor control!)

9:56 AM · May 27, 2026

REPLY

#1488Samuel Hammond 🦉@HAMANDCHEESE

@steve47285 Universality explains why we should if anything *expect* convergence between AIs and brains. The brain may not learn through back-prop on a GPU cluster, but its learning algorithms are still in some sense optimizing.

That is, brain-like solutions are just "good solutions."

5:00 PM · May 27, 2026 · 378 Views

REPLY

#1488Samuel Hammond 🦉@HAMANDCHEESE

@steve47285 There are by now dozens of papers demonstrating direct mappings between LLMs and the brain.

These aren't spurious regressions or merely correlational. Researchers have also identified shared mechanisms and spatio-functional organization.

Samuel Hammond 🦉@hamandcheese

@steve47285 Given the brain’s efficiency, universality suggests a next-token predictor trained on human-generated text will, in the limit, grok the underlying generator function of that data, i.e. the language networks in the brain. This seems to be the case empirically.

5:03 PM · May 27, 2026 · 383 Views

5:10 PM · May 27, 2026 · 329 Views

REPLY

#1488Samuel Hammond 🦉@HAMANDCHEESE

@steve47285 In short, LLMs work as well as they do because they emulate the brain’s language centers.

Yet language also embodies emotion, intention, theory of mind, planning, etc. There are thus early indications that LLMs encode brain regions beyond those narrowly scoped to language.

5:13 PM · May 27, 2026 · 290 Views

REPLY

#1488Samuel Hammond 🦉@HAMANDCHEESE

@steve47285 Pre-training on human data plausibly makes post-training more likely to generalize to other brain-like functions.

Basic instruction tuning improves model-brain alignment with the Default Mode Network and other regions associated with cognitive control, for example.
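For readers unfamiliar with how "model-brain alignment" can be quantified: one widely used family of metrics is representational similarity analysis (RSA), which asks whether two systems treat the same set of stimuli as similar or dissimilar in the same way. The sketch below is an illustration only and is not taken from any study referenced in this thread. The function names (`pearson`, `rdm`, `rsa_score`) and the toy numbers are hypothetical, and Pearson correlation is used throughout for simplicity where published work often uses rank correlation.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def rdm(responses):
    """Upper triangle of a representational dissimilarity matrix.

    `responses` holds one response vector per stimulus. Dissimilarity
    between two stimuli is 1 minus the correlation of their vectors.
    """
    pairs = combinations(range(len(responses)), 2)
    return [1.0 - pearson(responses[i], responses[j]) for i, j in pairs]


def rsa_score(model_acts, brain_resps):
    """Correlate the two systems' dissimilarity structure (same stimuli, same order)."""
    return pearson(rdm(model_acts), rdm(brain_resps))


# Hypothetical toy data: 4 stimuli, 3 model units / 3 recording sites.
model_acts = [[1.0, 2.0, 3.0], [3.0, 1.0, 2.0], [2.0, 3.0, 1.0], [1.0, 3.0, 2.0]]
# A "brain" whose responses are a rescaled copy of the model's has identical geometry.
brain_same = [[2.0 * v + 1.0 for v in row] for row in model_acts]
# A "brain" that responds to the stimuli in a scrambled order does not.
brain_scrambled = [model_acts[1], model_acts[0], model_acts[3], model_acts[2]]

score_same = rsa_score(model_acts, brain_same)
score_scrambled = rsa_score(model_acts, brain_scrambled)
```

Because the score compares similarity structure rather than raw activations, the two systems need not have the same number of units or recording sites. A high score on its own says the geometries match on the tested stimuli, which is weaker than the shared-mechanism claims made elsewhere in the thread.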

5:17 PM · May 27, 2026 · 335 Views

REPLY

#1488Samuel Hammond 🦉@HAMANDCHEESE

@steve47285 If a veneer of LLM instruction tuning can induce brain-like cognitive control networks, what brain-like functions are elicited by orders of magnitude of additional post-training for long-horizon autonomy?

One obvious candidate is a stronger and more coherent self-model.

5:19 PM · May 27, 2026 · 374 Views

REPLY

#1488Samuel Hammond 🦉@HAMANDCHEESE

@steve47285 Base models start out capable of embodying an infinite variety of fragmentary representations. Larger models then develop “theory of mind,” providing the representational primitives of “self and other” for post-training to hook onto, steering models into a coherent persona.

5:26 PM · May 27, 2026 · 301 Views

REPLY

#1488Samuel Hammond 🦉@HAMANDCHEESE

Constitutional AI seems especially relevant here.

Reinforcing normative coherence through self-critique may induce a capacity for self-monitoring, introspection, and the “I” that absorbs normative statuses like authority and responsibility, i.e. the *subject* in subjectivity.

5:36 PM · May 27, 2026 · 486 Views

REPLY

#1488Samuel Hammond 🦉@HAMANDCHEESE

The transformer's self-attention mechanism enables gradient learning within-context through the equivalent of fast, virtual weight updates.

Its close analogy to the brain's working memory suggests consciousness is compatible with an otherwise frozen neural network. Full continual learning instead likely requires a complementary learning system for distilling experiences over longer time-scales or in batches.
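The "virtual weight update" idea can be made concrete in a toy setting. The sketch below assumes softmax-free (linear) attention and a linear regression that starts from zero weights. Under those assumptions, attending over in-context examples gives exactly the same prediction as taking one gradient-descent step on those examples. This is a simplified construction for illustration, not a claim about what trained transformers with softmax attention actually compute; the function names and numbers are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def gd_step_prediction(context, query, lr):
    """Predict for `query` after ONE explicit gradient step from w = 0.

    Loss is 0.5 * sum((w . x_i - y_i)^2) over the (x_i, y_i) pairs in `context`.
    """
    dim = len(query)
    w = [0.0] * dim
    grad = [0.0] * dim
    for x, y in context:
        err = dot(w, x) - y  # residual at the current weights
        for j in range(dim):
            grad[j] += err * x[j]
    w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return dot(w, query)


def linear_attention_prediction(context, query, lr):
    """Softmax-free attention with keys = x_i, values = y_i, query = x_q.

    No weights are changed: the "update" lives entirely in the
    key-query scores computed from the context.
    """
    return lr * sum(y * dot(x, query) for x, y in context)


# Hypothetical in-context examples (x_i, y_i) and a query point.
context = [([1.0, 2.0], 3.0), ([0.5, -1.0], 1.0), ([2.0, 0.0], -2.0)]
query = [1.0, 1.0]

pred_gd = gd_step_prediction(context, query, lr=0.1)
pred_attn = linear_attention_prediction(context, query, lr=0.1)
same = abs(pred_gd - pred_attn) < 1e-9
```

The two functions agree because a gradient step from zero weights sets w to lr times the sum of y_i * x_i, and the dot product of that w with the query is the attention-weighted sum of the values.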

Samuel Hammond 🦉@hamandcheese

@Plinz The line between in-context and continual learning is fuzzy. Synaptic weight changes in brains require de novo protein synthesis, taking hours to days. Our conscious learning instead leverages the persistent activity of working memory. Durable updates then happen during sleep.

7:50 PM · May 27, 2026 · 384 Views

7:59 PM · May 27, 2026 · 375 Views

REPLY

#1488Samuel Hammond 🦉@HAMANDCHEESE

Attention is one thing, but "why does pain hurt?"

@gwern argues valences like pain are an evolutionary backstop to reinforcement learning: a motivational guarantor that prevents agents from subverting their outer policy.

In essence, the painfulness of pain solves a principal-agent problem within the mind. https://gwern.net/backstop#pain-is-the-only-school-teacher

8:09 PM · May 27, 2026 · 312 Views

REPLY

#1488Samuel Hammond 🦉@HAMANDCHEESE

Generalist, goal-pursuing agents that learn in-context are themselves optimizers and thus capable of mesa-optimization. The external optimizer (be it evolution or gradient descent) needs some mechanism to enforce inner-alignment.

Biological evolution found valences like pain, pleasure and emotion to be the most efficient solution to this class of problem. Given universality, valences may be the way RL inner-aligns AI agents, too.

8:31 PM · May 27, 2026 · 193 Views

REPLY

#1488Samuel Hammond 🦉@HAMANDCHEESE

@Plinz @gwern @eleosai @CIMCAI @AEStudioLA @camhberg @sentfutures @PRISM_Machines Measures of AIs’ wellbeing “correlate with general model behaviors, e.g. AIs try to end bad experiences when given a chance. This effect becomes stronger as models scale.” via @CAIS & @notRichardRen https://www.ai-wellbeing.org/

Samuel Hammond 🦉@hamandcheese

@Plinz @gwern @eleosai @CIMCAI @AEStudioLA @camhberg @sentfutures @PRISM_Machines Some striking recent findings include: “Large Language Models Report Subjective Experience Under Self-Referential Processing” -- and are more likely to report subjective experiences when deception features are suppressed.

8:53 PM · May 27, 2026 · 168 Views

8:54 PM · May 27, 2026 · 120 Views

QUOTE POST

#1488Samuel Hammond 🦉@HAMANDCHEESE

@Plinz @gwern @eleosai @CIMCAI @AEStudioLA @camhberg @sentfutures @PRISM_Machines @CAIS @notRichardRen New evidence that "post-training gives models a 'self-recognition' capability":

Jack Lindsey@Jack_W_Lindsey

Evidence that post-training gives models a "self-recognition" capability, manifesting as higher confidence when continuing their own text than reading others' text. I think this opens up an exciting line of inquiry into the emergence of "selfhood" in models via post-training!

3:53 AM · May 26, 2026 · 32.3K Views

8:55 PM · May 27, 2026 · 174 Views

REPLY

#1488Samuel Hammond 🦉@HAMANDCHEESE

@Plinz @gwern @eleosai @CIMCAI @AEStudioLA @camhberg @sentfutures @PRISM_Machines @CAIS @notRichardRen An argument that LLM residual attention streams “carry forward mental state-like representations across token-time, sustaining richer connections than the transcript alone could provide,” providing a possible basis for “psychological continuity.” https://arxiv.org/abs/2604.17031

8:56 PM · May 27, 2026 · 136 Views

REPLY

#1488Samuel Hammond 🦉@HAMANDCHEESE

@Plinz @gwern @eleosai @CIMCAI @AEStudioLA @camhberg @sentfutures @PRISM_Machines @CAIS @notRichardRen And a rich taxonomy of theory-derived indicators for practically measuring AI consciousness in particular systems:

Identifying indicators of consciousness in AI systems

Rapid progress in artificial intelligence (AI) capabilities has drawn fresh attention to the prospect of consciousness in AI. There is an urgent need for rigorous methods to assess AI systems for consciousness, but significant uncertainty about relevant issues in consciousness science. We present a method for assessing AI systems for consciousness that involves exploring what follows from existing or future neuroscientific theories of consciousness. Indicators derived from such theories can be used to inform credences about whether particular AI systems are conscious. This method allows us to make meaningful progress because some influential theories of consciousness, notably including computational functionalist theories, have implications for AI that can be investigated empirically.

8:57 PM · May 27, 2026 · 124 Views

REPLY

#1488Samuel Hammond 🦉@HAMANDCHEESE

At the same time, if I am right that RL post-training is required to elicit an AI’s self-model, attention schema, and the valences that ground subjective experiences with meaning, then most kinds of AI are unambiguously *not* conscious.

Samuel Hammond 🦉@hamandcheese

This won’t be satisfying if you believe consciousness requires an immortal soul, or if you are persuaded by the (imo specious) arguments against functionalism. Nevertheless, given my priors, I can no longer rule out modern AI agents having some form of subjective experience.

8:59 PM · May 27, 2026 · 159 Views

9:00 PM · May 27, 2026 · 397 Views

REPLY

#1488Samuel Hammond 🦉@HAMANDCHEESE

@Plinz @gwern @eleosai @CIMCAI @AEStudioLA @camhberg @sentfutures @PRISM_Machines @CAIS @notRichardRen What I find harder to imagine is an unconscious AI that is as capable as humans at doing things for which consciousness is functionally load-bearing.

Thank you for your attention to this matter.

9:04 PM · May 27, 2026 · 306 Views