"why can we forecast the weather in 3 days with strong accuracy but not 300 days" presupposes that there could be a model of thinking unified enough to let you solve exactly for whatever the weather will be in 300 days "where's the theory for why AI works" presupposes... etc
"transfer learning from geometric structure happens when you distribute the solution to a problem thinly across an iterated structure" is the most parsimonious accounting i worry that hoping for a unified general relativity type discovery is kind of a category error in some way