As LLMs surpass humans on many fronts, how can we keep training stronger models?
Our ICML 2026 paper studies this via weak-to-strong generalization and shows that learning when to trust the weak teacher may be key.
Trust Functions: Near-Lossless Weak-to-Strong Generalization 🧵
Our internal data shows Claude is accelerating AI development, a possible path to recursive self-improvement: AI autonomously building a more capable successor.
It’s happening faster than we thought, and the implications deserve greater attention. https://www.anthropic.com/institute/recursive-self-improvement