Researchers introduce Persona Policies to evolve realistic LLM user simulators
Researchers introduced Persona Policies, a method that evolves LLM-based user simulators for AI agent training so that they exhibit behaviors such as faltering, forgetting, and pushing back. In live human evaluations, raters judged real human chats to be human 80.0 percent of the time and Persona Policies chats 80.4 percent of the time, versus 46.5 percent for chats from the baseline τ² simulator, across dozens of rated conversations.