
Amiri Hayes and MIT researchers replace transformer attention heads with human-readable Python code using program synthesis

The resulting hybrid models show only minimal performance degradation


Original post

Amiri Hayes (@amirihayes_)

What if attention were code? We show that many attention heads in transformer LMs can be replaced by human-readable Python programs. Swap them in and the model barely notices.

See our experiments here: Explaining Attention with Program Synthesis [https://arxiv.org/abs/2606.19317]

8:42 AM · Jun 29, 2026 · 13.9K Views
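
To make the idea in the post concrete, here is a minimal sketch of what "attention as code" could look like. It is not taken from the paper: the function names, the previous-token rule, and the toy value vectors are all illustrative assumptions. The sketch shows a short program that produces an attention pattern directly from the token sequence, which is then used to mix value vectors in place of a learned head's weights.

```python
from typing import List


def previous_token_pattern(tokens: List[str]) -> List[List[float]]:
    """Hypothetical head program: each position attends to the token before it.

    Returns an n x n matrix whose row i holds the attention weights that
    position i places on positions 0..n-1. Each row sums to 1.
    """
    n = len(tokens)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Position 0 has no predecessor, so it attends to itself.
        j = i - 1 if i > 0 else 0
        weights[i][j] = 1.0
    return weights


def apply_pattern(weights: List[List[float]],
                  values: List[List[float]]) -> List[List[float]]:
    """Mix per-token value vectors using the given attention weights."""
    dim = len(values[0])
    return [
        [sum(w * v[d] for w, v in zip(row, values)) for d in range(dim)]
        for row in weights
    ]


# Toy inputs (assumed for illustration only).
tokens = ["The", "cat", "sat"]
values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

pattern = previous_token_pattern(tokens)
mixed = apply_pattern(pattern, values)
print(mixed)
```

In this toy setup, each output row is simply the value vector of the preceding token. Whether a program like this can stand in for a real head would depend on how closely its pattern matches what that head computes, which is the question the linked preprint investigates.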


Posts from X


Views: 10.3K · Bookmarks: 39 · Likes: 83 · Retweets: 2

Jacob Andreas (@jacobandreas)

👉 New preprint! Automated interpretability by approximating / replacing NN components (here attention heads) with programs.

Quoting Amiri Hayes (@amirihayes_): the original post above.

Laura Ruis (@LauraRuis)

Very cool work ⬇️

Quoting Amiri Hayes (@amirihayes_): the original post above.