
1/ We study “digital twins” of macaque V1/V4 -- vision models trained to predict the activity of biological neurons in the primate visual cortex -- and use their outputs to study how the brain structures the world.


4/ But is the hypothesis right? Generate new images from it → test them on the twin. In V1, this rediscovers the known selectivity for oriented gratings; in V4, the words drive 96.1% of neurons above the 95th percentile of natural responses.
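
For readers who want the check in 4/ spelled out, here is a minimal sketch of one way to compute "fraction of neurons driven above the 95th percentile". It assumes the threshold is set per neuron from that neuron's own responses to natural images, and it uses a nearest-rank percentile; both are assumptions, and `percentile` and `fraction_driven` are hypothetical names, not code from the paper.

```python
def percentile(values, q):
    """Nearest-rank percentile of a list; q is an integer in 1..100."""
    ordered = sorted(values)
    # Integer ceiling of q * n / 100, so no floating-point rounding is involved.
    rank = max(1, (q * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def fraction_driven(natural, generated, q=95):
    """Fraction of neurons whose response to the generated image beats their own threshold.

    natural:   one list per neuron of twin responses to natural images (assumed layout)
    generated: one twin response per neuron to the hypothesis-generated image
    """
    hits = sum(
        1 for nat, gen in zip(natural, generated) if gen > percentile(nat, q)
    )
    return hits / len(generated)


# Toy data: two neurons with identical natural responses 1..100 (threshold = 95).
natural = [list(range(1, 101)), list(range(1, 101))]
generated = [96, 95]  # only the first neuron strictly exceeds its threshold
print(fraction_driven(natural, generated))
```

A strict `>` is used here, so a response that only ties the threshold does not count as driven.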

Many thanks to Nikos Karantzas, @kfrankelab, @AToliasLab @naturecomputes @SuryaGanguli @TamarRottShaham and the rest of the team at the Enigma Project!

7/ Models promise agentic discovery, but rarely define how to uncover and verify findings. Here, researchers work in tandem with models to deepen understanding. Explore more on our website + read the paper 👇
Website: https://enigma-brain.github.io/letting-the-neural-code-speak/ Paper: https://arxiv.org/pdf/2605.12485

5/ Why does this work? Vision, language, and neural activity partially share a common geometry!

2/ Feature selectivity is visually interpretable -- but how can we reliably scale this? We "translate" images to text with a dense caption. While VLMs get stuck on semantics, language provides a human-interpretable discretization of visual input that a powerful LLM can reason over.

6/ This UMAP of V4 neurons, annotated with hypothesis keywords, shows smooth semantic transitions across the population.

3/ Screen 1M+ images on the twin → take each neuron’s most- and least-activating stimuli → distill their captions into one hypothesis.
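
The screening step in 3/ can be sketched roughly as below. This is a toy illustration, not the team's pipeline: it assumes the twin's predicted responses for one neuron are already available as a list, that each image has a dense caption, and that the LLM call happens elsewhere. `extreme_stimuli` and `build_prompt` are hypothetical helper names, and the captions are made up.

```python
import heapq


def extreme_stimuli(responses, k=5):
    """Return (top_k, bottom_k) image indices for one neuron's predicted responses."""
    indexed = list(enumerate(responses))
    top = heapq.nlargest(k, indexed, key=lambda pair: pair[1])
    bottom = heapq.nsmallest(k, indexed, key=lambda pair: pair[1])
    return [i for i, _ in top], [i for i, _ in bottom]


def build_prompt(captions, top_idx, bottom_idx):
    """Assemble the captions of the extreme stimuli into one text prompt for an LLM."""
    lines = ["Images that MOST activate the neuron:"]
    lines += [f"- {captions[i]}" for i in top_idx]
    lines.append("Images that LEAST activate the neuron:")
    lines += [f"- {captions[i]}" for i in bottom_idx]
    lines.append("State one hypothesis about what this neuron responds to.")
    return "\n".join(lines)


# Toy data standing in for twin outputs and dense captions (illustrative only).
responses = [0.1, 0.9, 0.5, 0.0, 0.7]
captions = [
    "a flat gray wall",
    "a curved red petal with fine veins",
    "a wooden fence",
    "an empty white sky",
    "a pink rose seen up close",
]

top_idx, bottom_idx = extreme_stimuli(responses, k=2)
prompt = build_prompt(captions, top_idx, bottom_idx)
print(prompt)
```

At the scale of 1M+ images, a heap-based selection like this avoids sorting every response for every neuron, though a vectorized top-k over a response matrix would be the more likely choice in practice.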