GoodfireAI researchers identify geometric addition module in Llama
Researchers at GoodfireAI have identified an internal computation module in Llama that performs addition by rotating shapes in neural representation space. The mechanism acts as a general-purpose calculator and is detailed in a new blog post and paper from the GoodfireAI team; similar geometric structures in the hidden layers are reused for domains beyond arithmetic.
One of the most fun results we've pulled together on the neural geometry research thread: Llama has a general computation module for performing addition, but uses geometry to map in and out of the domain! Check out some of the prettiest plots ever in the blog/paper haha.
Neural networks do math by rotating shapes. We found a shape-rotating calculator hidden inside an LLM – and it’s used for more than just math! (1/6)
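The "shape-rotating calculator" idea can be illustrated with a toy sketch: represent each integer as an angle on a circle, so that adding two numbers amounts to composing rotations. This is an illustrative reconstruction of the general geometric mechanism, not the circuit the GoodfireAI paper actually reverse-engineered; the names `PERIOD`, `encode`, `rotation`, and `decode` are invented for the example.

```python
import numpy as np

PERIOD = 10  # modulus of the circular representation (illustrative choice)

def encode(n: int) -> np.ndarray:
    """Embed an integer as a point on the unit circle."""
    theta = 2 * np.pi * n / PERIOD
    return np.array([np.cos(theta), np.sin(theta)])

def rotation(n: int) -> np.ndarray:
    """2x2 rotation matrix that advances the circle by n steps."""
    theta = 2 * np.pi * n / PERIOD
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

def decode(v: np.ndarray) -> int:
    """Read the angle back out as an integer mod PERIOD."""
    theta = np.arctan2(v[1], v[0]) % (2 * np.pi)
    return round(theta * PERIOD / (2 * np.pi)) % PERIOD

# Adding a to b is rotating b's embedding by a's angle:
print(decode(rotation(7) @ encode(8)))  # → 5, i.e. (7 + 8) mod 10
```

Because rotations compose, the same machinery handles any chain of additions, which is the geometric intuition behind "doing math by rotating shapes."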