6h ago

Nando de Freitas unveils interventional SFT for agent models

Nando de Freitas, vice president of AI at Microsoft, outlined interventional SFT to stop agentic language models from self-confirming delusions. The method modifies supervised fine-tuning to ignore action tokens and train only on observations. Experiments with over 30 prompts yielded better factual selection than conventional approaches. Supporting code and examples appear on love4all.ai and GitHub.
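
The summary above does not reproduce the code itself, so the following is only a minimal sketch of what "ignore action tokens and train only on observations" could look like as a loss mask. It is not the author's implementation. The helper names (`mask_action_labels`, `masked_nll`), the toy token ids, and the per-position log-probability dictionaries are all hypothetical. The only borrowed convention is the label value -100, which is the default `ignore_index` of PyTorch's `CrossEntropyLoss`. The usual one-position shift between inputs and labels in next-token training is left out for brevity.

```python
import math

# Assumption: -100 mirrors the default ignore_index of torch.nn.CrossEntropyLoss.
IGNORE_INDEX = -100


def mask_action_labels(token_ids, is_action):
    """Replace the label of every action token with IGNORE_INDEX.

    token_ids: list of int token ids for one agent trajectory.
    is_action: list of bool, True where the token was emitted by the agent.
    """
    return [IGNORE_INDEX if a else t for t, a in zip(token_ids, is_action)]


def masked_nll(logprobs, labels):
    """Mean negative log-likelihood over positions whose label is not ignored.

    logprobs: list of dicts mapping token id -> log-probability (toy model output).
    labels:   list of int labels, IGNORE_INDEX where no loss should be taken.
    """
    kept = [-lp[y] for lp, y in zip(logprobs, labels) if y != IGNORE_INDEX]
    return sum(kept) / len(kept) if kept else 0.0


# Hypothetical trajectory: two action tokens followed by two observation tokens.
token_ids = [11, 12, 21, 22]
is_action = [True, True, False, False]

# The masking step: action tokens no longer contribute to the loss.
labels = mask_action_labels(token_ids, is_action)

# Toy log-probabilities the model assigns to the true token at each position.
logprobs = [
    {11: math.log(0.9)},
    {12: math.log(0.8)},
    {21: math.log(0.5)},
    {22: math.log(0.25)},
]

loss = masked_nll(logprobs, labels)
print(labels)
print(round(loss, 4))
```

In this sketch the model is still conditioned on the action tokens as context, but it is never trained to predict them, which is one way to read the idea of treating the agent's own outputs as interventions rather than as evidence.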

Original post

One line of code is all it takes to prevent LLM agent delusions, instead of post-training patches like RL. https://love4all.ai/blog/why-it-is-important-to-understand-causality-and-agency/ ❤️ 4 ∀ https://github.com/nandodef/love4all-ai/tree/main/docs/files

4:11 AM · May 17, 2026

The road here wasn’t easy. It started with our work on delusions with @AdaptiveAgents @ShaneLegg @scott_e_reed and many other bright scientists:

https://arxiv.org/pdf/2110.10819

But instead of counterfactual learning, the theory of international imitation as a route to agency provided the foundation:

https://adaptiveagents.org/universal_ai_as_imitation

The research was accelerated by @OpenAI GPT5.5 and Codex. When I ran out of Pro credits 😅 I switched to @AnthropicAI Claude. I wish there were special LLM licenses for academic work @gdb @sama @DarioAmodei 🙏

The bottleneck for research these days is computational resources/energy. I’m glad that startups like @cusp_ai are addressing the energy challenges.

This research was possible thanks to my @CIFAR_News fellowship - the 🇨🇦 gift that keeps on giving - and my adjunct/associated professorships @UBC_CS and @CompSciOxford

Nando de Freitas @NandoDF

One line of code is all it takes to prevent LLM agent delusions, instead of post-training patches like RL. https://love4all.ai/blog/why-it-is-important-to-understand-causality-and-agency/ ❤️ 4 ∀ https://github.com/nandodef/love4all-ai/tree/main/docs/files

11:11 AM · May 17, 2026 · 15.3K Views
11:29 AM · May 17, 2026 · 2.7K Views

@AdaptiveAgents @ShaneLegg @scott_e_reed Typo: universal imitation, not international imitation 😅 🌌🌍

12:40 PM · May 17, 2026 · 1.5K Views

Very excited about this! Just fine-tune on the observation tokens and ignore the action ones to treat the agent's output as a causal intervention.

This is one of those moments when I'm surprised the maths works in practice 😅.

1:07 PM · May 17, 2026 · 1.9K Views
