3d ago

Researchers introduce Persona Policies to evolve realistic LLM user simulators

Researchers introduced Persona Policies, a method that evolves LLM-based user simulators to exhibit behaviors such as faltering, forgetting, and pushing back, with the aim of making AI agent training more realistic. In live human evaluations, raters judged real human chats to be human 80.0 percent of the time and Persona Policies chats 80.4 percent of the time, compared with 46.5 percent for chats from the baseline τ² simulator, across dozens of rated conversations.

Original post

We evolve user simulators so realistic, in live human evals people can’t tell the difference between them and real people! Whereas the baseline user simulator is identified as a bot ~half the time. If you need to train AI agents with more realistic users, check out our new paper Persona Policies (PPol)!

10:49 AM · May 16, 2026