🌀 Introducing 𝐄𝐪𝐮𝐢𝐥𝐢𝐛𝐫𝐢𝐮𝐦 𝐑𝐞𝐚𝐬𝐨𝐧𝐞𝐫𝐬 (𝐄𝐪𝐑)!
Feedforward and weight-tied models generalize very differently on hard reasoning tasks.
EqR pushes this difference to the extreme by learning 𝐭𝐚𝐬𝐤-𝐜𝐨𝐧𝐝𝐢𝐭𝐢𝐨𝐧𝐞𝐝 𝐧𝐞𝐮𝐫𝐚𝐥 𝐚𝐭𝐭𝐫𝐚𝐜𝐭𝐨𝐫𝐬.
• Sudoku-Extreme: 99.8%
• Maze: 93%
#ICML2026