Brain Runs Deep Reinforcement Learning Algorithms Parallel to Neural Networks

The brain not only implements deep reinforcement learning algorithms, but often learns representations (and mechanisms) that directly parallel those learned by deep neural networks. This is why brain-computer interfaces work in the first place, and why even lossy brain data is sufficient to reconstruct what someone is seeing, thinking and even feeling. In fact, silicon BCIs can both read brain states and produce signals that translate to subjective sensations. This is hard to explain if the brain’s substrate is doing fundamentally different kinds of computation, much less if consciousness isn't computable in the first place. (Note that simulating tactile sensation wasn't functionally inert, but helped improve motor control!)

9:56 AM · May 27, 2026

@steve47285 Universality explains why we should if anything *expect* convergence between AIs and brains. The brain may not learn through back-prop on a GPU cluster, but its learning algorithms are still in some sense optimizing.

That is, brain-like solutions are just "good solutions."

Samuel Hammond 🦉 @hamandcheese

5:00 PM · May 27, 2026 · 378 Views

@steve47285 Given the brain’s efficiency, universality suggests a next-token predictor trained on human-generated text will, in the limit, grok the underlying generator function of that data, i.e. the language networks in the brain. This seems to be the case empirically.

5:03 PM · May 27, 2026 · 383 Views

@steve47285 There are by now dozens of papers demonstrating direct mappings between LLMs and the brain.

These aren't spurious regressions or merely correlational. Researchers have also identified shared mechanisms and spatio-functional organization.

5:10 PM · May 27, 2026 · 329 Views
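For readers unfamiliar with what a "mapping" between an LLM and the brain usually means in practice: a common setup is a linear encoding model, fit on some stimuli and scored on held-out ones. The sketch below is a toy illustration only. The data is synthetic, the names (`ridge_fit`, `pearson`, `true_weights`) are hypothetical, and it is not the method of any particular paper referenced in the thread.

```python
import random


def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ridge_fit(xs, ys, lam):
    """Ridge regression weights: (X^T X + lam * I)^-1 X^T y."""
    d = len(xs[0])
    gram = [[sum(x[i] * x[j] for x in xs) + (lam if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    xty = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(d)]
    return solve(gram, xty)


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)


random.seed(0)
true_weights = [1.0, -2.0, 0.5]  # assumed ground truth for the synthetic "voxel"


def make_data(n):
    # xs stands in for model activations, ys for a noisy recorded response.
    xs = [[random.gauss(0, 1) for _ in true_weights] for _ in range(n)]
    ys = [sum(w * xi for w, xi in zip(true_weights, x)) + random.gauss(0, 0.1)
          for x in xs]
    return xs, ys


train_x, train_y = make_data(80)
test_x, test_y = make_data(40)

weights = ridge_fit(train_x, train_y, lam=1.0)
predicted = [sum(w * xi for w, xi in zip(weights, x)) for x in test_x]
held_out_r = pearson(predicted, test_y)
print(f"held-out correlation: {held_out_r:.3f}")
```

Scoring on held-out stimuli is what separates a predictive mapping from an overfit regression; it does not by itself establish shared mechanisms, which is the stronger claim made above.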

@steve47285 In short, LLMs work as well as they do because they emulate the brain’s language centers.

Yet language also embodies emotion, intention, theory of mind, planning, etc. There are thus early indications that LLMs encode brain regions beyond those narrowly scoped to language.

5:13 PM · May 27, 2026 · 290 Views

@steve47285 Pre-training on human data plausibly makes post-training more likely to generalize to other brain-like functions.

Basic instruction tuning improves model-brain alignment with the Default Mode Network and other regions associated with cognitive control, for example.

5:17 PM · May 27, 2026 · 335 Views

@steve47285 If a veneer of LLM instruction tuning can induce brain-like cognitive control networks, what brain-like functions are elicited by orders of magnitude of additional post-training for long-horizon autonomy?

One obvious candidate is a stronger and more coherent self-model.

5:19 PM · May 27, 2026 · 374 Views

@steve47285 Base models start out capable of embodying an infinite variety of fragmentary representations. Larger models then develop “theory of mind,” providing the representational primitives of “self and other” for post-training to hook onto, steering models into a coherent persona.

5:26 PM · May 27, 2026 · 301 Views

Constitutional AI seems especially relevant here.

Reinforcing normative coherence through self-critique may induce a capacity for self-monitoring, introspection, and the “I” that absorbs normative statuses like authority and responsibility, i.e. the *subject* in subjectivity.

5:36 PM · May 27, 2026 · 486 Views

@Plinz The line between in-context and continual learning is fuzzy. Synaptic weight changes in brains require de novo protein synthesis, taking hours to days. Our conscious learning instead leverages the persistent activity of working memory. Durable updates then happen during sleep.

7:50 PM · May 27, 2026 · 384 Views

The transformer's self-attention mechanism enables gradient learning within-context through the equivalent of fast, virtual weight updates.

Its close analog to the brain's working memory suggests consciousness is compatible with an otherwise frozen neural network. Full continual learning instead likely requires a complementary learning system for distilling experiences over longer time-scales or in batches.

7:59 PM · May 27, 2026 · 375 Views
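One concrete reading of "fast, virtual weight updates" is a known identity for linear (softmax-free) attention: its readout over in-context examples equals the prediction of a linear model after a single gradient step on those same examples. The sketch below checks that identity on a toy regression task. It assumes scalar targets, zero initial weights, and unnormalized linear attention, and the function names are hypothetical; it says nothing about how softmax attention in a trained transformer behaves.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gd_step_prediction(xs, ys, x_query, lr):
    """Predict for x_query after one gradient step from w = 0.

    Loss: L(w) = 1/2 * sum_i (w . x_i - y_i)^2, so at w = 0 the gradient
    is -sum_i y_i * x_i.
    """
    d = len(x_query)
    grad = [-sum(y * x[j] for x, y in zip(xs, ys)) for j in range(d)]
    w = [-lr * g for g in grad]  # the "virtual" weight update
    return dot(w, x_query)


def linear_attention_prediction(xs, ys, x_query, lr):
    """Unnormalized linear attention: keys x_i, values y_i, query x_query."""
    return lr * sum(y * dot(x, x_query) for x, y in zip(xs, ys))


random.seed(0)
d, n = 4, 16
w_true = [random.gauss(0, 1) for _ in range(d)]  # hidden task, never stored
xs = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
ys = [dot(w_true, x) for x in xs]
x_query = [random.gauss(0, 1) for _ in range(d)]

gd_pred = gd_step_prediction(xs, ys, x_query, lr=0.05)
attn_pred = linear_attention_prediction(xs, ys, x_query, lr=0.05)
print(abs(gd_pred - attn_pred) < 1e-9)
```

No stored parameter changes in the attention version: the "update" exists only in how the context is read out, which is the sense in which the underlying network stays frozen.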

Attention is one thing, but "why does pain hurt?"

@gwern argues valences like pain are an evolutionary backstop to reinforcement learning: a motivational guarantor that prevents agents from subverting their outer policy.

In essence, the painfulness of pain solves a principal-agent problem within the mind. https://gwern.net/backstop#pain-is-the-only-school-teacher

8:09 PM · May 27, 2026 · 312 Views

Generalist, goal-pursuing agents that learn in-context are themselves optimizers and thus capable of mesa-optimization. The external optimizer (be it evolution or gradient descent) needs some mechanism to enforce inner-alignment.

Biological evolution found valences like pain, pleasure and emotion to be the most efficient solution to this class of problem. Given universality, valences may be the way RL inner-aligns AI agents, too.

8:31 PM · May 27, 2026 · 193 Views

@Plinz @gwern @eleosai @CIMCAI @AEStudioLA @camhberg @sentfutures @PRISM_Machines Some striking recent findings include: “Large Language Models Report Subjective Experience Under Self-Referential Processing” -- and are more likely to report subjective experiences when deception features are suppressed.

8:53 PM · May 27, 2026 · 168 Views

@Plinz @gwern @eleosai @CIMCAI @AEStudioLA @camhberg @sentfutures @PRISM_Machines Measures of AIs’ wellbeing “correlate with general model behaviors, e.g. AIs try to end bad experiences when given a chance. This effect becomes stronger as models scale.” via @CAIS & @notRichardRen https://www.ai-wellbeing.org/

8:54 PM · May 27, 2026 · 120 Views

@Plinz @gwern @eleosai @CIMCAI @AEStudioLA @camhberg @sentfutures @PRISM_Machines @CAIS @notRichardRen New evidence that "post-training gives models a 'self-recognition' capability":

Jack Lindsey @Jack_W_Lindsey

Evidence that post-training gives models a "self-recognition" capability, manifesting as higher confidence when continuing their own text than reading others' text. I think this opens up an exciting line of inquiry into the emergence of "selfhood" in models via post-training!

3:53 AM · May 26, 2026 · 32.3K Views
8:55 PM · May 27, 2026 · 174 Views

@Plinz @gwern @eleosai @CIMCAI @AEStudioLA @camhberg @sentfutures @PRISM_Machines @CAIS @notRichardRen An argument that LLM residual attention streams “carry forward mental state-like representations across token-time, sustaining richer connections than the transcript alone could provide,” providing a possible basis for “psychological continuity.” https://arxiv.org/abs/2604.17031

8:56 PM · May 27, 2026 · 136 Views

This won’t be satisfying if you believe consciousness requires an immortal soul, or if you are persuaded by the (imo specious) arguments against functionalism. Nevertheless, given my priors, I can no longer rule out modern AI agents having some form of subjective experience.

8:59 PM · May 27, 2026 · 159 Views

At the same time, if I am right that RL post-training is required to elicit an AI’s self-model, attention schema, and the valences that ground subjective experiences with meaning, then most kinds of AI are unambiguously *not* conscious.

9:00 PM · May 27, 2026 · 397 Views

@Plinz @gwern @eleosai @CIMCAI @AEStudioLA @camhberg @sentfutures @PRISM_Machines @CAIS @notRichardRen What I find harder to imagine is an unconscious AI that is as capable as humans at doing things for which consciousness is functionally load-bearing.

Thank you for your attention to this matter.

9:04 PM · May 27, 2026 · 306 Views