Research Shows 2-Layer Models Memorize Graphs With Minimal Eigendirections

Original post

Also added clearer plots showing something unexplained by any existing theory about word2vec/node2vec models: a wide 2-layer model under CE loss somehow uses only a minimal set of eigendirections to memorize a graph even w/o regularization! (e.g., 2 directions for cycle graph)

9:21 AM · May 26, 2026
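As a rough illustration of why two directions can be enough for a cycle graph, the sketch below checks a standard spectral fact. The cosine and sine vectors are eigenvectors of the cycle's adjacency matrix with the same eigenvalue, 2·cos(2π/n), and embedding each node with those two coordinates places the nodes on a circle where every node's nearest neighbors are its graph neighbors. This is not the paper's model, loss, or training setup. The helper names (`cycle_adjacency`, `circle_directions`, `nearest_neighbors`) and the choice `n = 12` are assumptions made for this example.

```python
import math


def cycle_adjacency(n):
    """Adjacency matrix of the n-node cycle graph, as nested lists."""
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        A[i][(i + 1) % n] = 1
        A[i][(i - 1) % n] = 1
    return A


def matvec(A, v):
    """Plain matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def circle_directions(n):
    """Two eigenvectors sharing the eigenvalue 2*cos(2*pi/n)."""
    c = [math.cos(2 * math.pi * j / n) for j in range(n)]
    s = [math.sin(2 * math.pi * j / n) for j in range(n)]
    return c, s


def is_eigenvector(A, v, lam, tol=1e-9):
    """Check A v == lam * v entrywise, up to a small tolerance."""
    return all(abs(av - lam * x) <= tol for av, x in zip(matvec(A, v), v))


def nearest_neighbors(points, i, k=2):
    """Indices of the k points closest to points[i] (Euclidean distance)."""
    dists = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    return sorted(j for _, j in dists[:k])


n = 12  # illustrative size, not taken from the paper
A = cycle_adjacency(n)
c, s = circle_directions(n)
lam = 2 * math.cos(2 * math.pi / n)

# Each node gets a 2-D coordinate built from the two eigendirections.
embedding = list(zip(c, s))

print(is_eigenvector(A, c, lam), is_eigenvector(A, s, lam))
print(nearest_neighbors(embedding, 0))
```

In this hand-built embedding, the two nearest points to node 0 are nodes 1 and n-1, its neighbors on the cycle. Whether a trained 2-layer model arrives at the same two directions, and why, is the open question the post raises.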

Vaishnavh Nagarajan @_vaishnavh

We also studied how different training hyperparameters (lr, wd, initialization) affect whether the model memorizes geometrically or associatively. It seems like there are various knobs that make the model memorize. Hope this helps as a starting point for some theory!

Vaishnavh Nagarajan@_vaishnavh

4:21 PM · May 26, 2026 · 94 Views

We also tried our best to make the paper as short and load-able as we could...

tagging co-authors @ShNoroozi (who led the work) and @ElanRosenfeld

Meet us at ICML if you want to chat!

arxiv.org

Deep sequence models tend to memorize geometrically; it is unclear why

Deep sequence models are said to store atomic facts predominantly in the form of associative memory: a brute-force lookup of co-occurring entities. We identify a dramatically different form of...

Vaishnavh Nagarajan@_vaishnavh

4:21 PM · May 26, 2026 · 77 Views
