Jasper Gilley shows that neural networks cycle through learning and forgetting, with correct solutions acting as unstable saddle points.
His methods can force grokked models to forget.
QUOTE POST
#1142 Abhishek Das @ABHSHKDZ
Go @0xjasper!
Here's the video of my talk @southpkcommons Demo Day! Featuring all new visualizations for why grokking works, how you can make grokked models forget, and what this says about memorization in LLMs
5:37 PM · May 26, 2026 · 2.3K Views
2:06 AM · May 27, 2026 · 717 Views