Nando de Freitas unveils interventional SFT for agent models
Nando de Freitas, vice president of AI at Microsoft, outlined interventional SFT, a method to stop agentic language models from confirming their own delusions. It modifies supervised fine-tuning to exclude action tokens from the loss and train only on observation tokens. In experiments with over 30 prompts, the method selected factual continuations more reliably than conventional fine-tuning. Supporting code and examples appear on love4all.ai and GitHub.
One line of code is all it takes to prevent LLM agent delusions, instead of post-training patches like RL.
https://love4all.ai/blog/why-it-is-important-to-understand-causality-and-agency/ ❤️ 4 ∀
https://github.com/nandodef/love4all-ai/tree/main/docs/files
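For readers curious what that "one line" looks like, here is a minimal sketch of the idea in a PyTorch-style training setup. The function name, the `is_action` mask, and the tensor layout are illustrative assumptions, not code from the love4all.ai repo; the load-bearing idea is simply that action tokens receive the ignored label, so only observation tokens contribute to the loss.

```python
import torch
import torch.nn.functional as F

IGNORE_INDEX = -100  # cross_entropy skips targets with this label

def interventional_sft_loss(logits, input_ids, is_action):
    """SFT loss that treats the agent's own actions as interventions:
    actions stay in the conditioning context but are never targets.

    logits:    (batch, seq, vocab) model outputs
    input_ids: (batch, seq) token ids
    is_action: (batch, seq) bool mask; True where the agent emitted
               the token (action), False where the environment did
               (observation). Hypothetical mask, see lead-in above.
    """
    labels = input_ids.clone()
    labels[is_action] = IGNORE_INDEX  # the "one line": ignore action tokens
    # Standard next-token shift: position t predicts token t+1.
    return F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        labels[:, 1:].reshape(-1),
        ignore_index=IGNORE_INDEX,
    )
```

Note the design point: action tokens are still fed to the model as input; they just carry no gradient as prediction targets, which is what distinguishes this from simply deleting actions from the trajectory.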

The road here wasn’t easy. It started with our work on delusions with @AdaptiveAgents @ShaneLegg @scott_e_reed and many other bright scientists: https://arxiv.org/pdf/2110.10819
But instead of counterfactual learning, the theory of universal imitation as a route to agency provided the foundation: https://adaptiveagents.org/universal_ai_as_imitation
The research was accelerated by @OpenAI GPT5.5 and Codex. When I ran out of Pro credits 😅 I switched to @AnthropicAI Claude. I wish there were special LLM licenses for academic work @gdb @sama @DarioAmodei 🙏
The bottleneck for research these days is computational resources/energy. I’m glad that startups like @cusp_ai are addressing the energy challenges.
This research was possible thanks to my @CIFAR_News fellowship - the 🇨🇦 gift that keeps on giving - and my adjunct/associated professorships @UBC_CS and @CompSciOxford
Very excited about this! Just fine-tune on the observation tokens and ignore the action ones to treat the agent's output as a causal intervention.
This is one of those moments when I'm surprised the maths works in practice 😅.
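One way to see why the masking works (my notation, paraphrasing the thread's causal framing, not formulas from the blog post): standard SFT trains on every token of a trajectory, so the model learns to predict, and hence treat as evidence, its own past actions. Interventional SFT restricts the sum to the observation indices $\mathcal{O}$.

```latex
% Standard SFT: all tokens, actions included, are prediction targets.
\mathcal{L}_{\mathrm{SFT}}(\theta) = -\sum_{t=1}^{T} \log p_\theta(x_t \mid x_{<t})

% Interventional SFT: only observation tokens are targets; actions
% remain in the context, the sequence-model analogue of conditioning
% on do(a_t) rather than on a_t as evidence.
\mathcal{L}_{\mathrm{int}}(\theta) = -\sum_{t \in \mathcal{O}} \log p_\theta(x_t \mid x_{<t})
```

Dropping the action terms from the sum is exactly the label-masking line in the sketch above: the model never fits $p(a_t \mid x_{<t})$ on its own rollouts, so a generated action cannot later serve to confirm the belief that produced it.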