someone ought to post-train an LLM so that -- instead of having a single persona that talks to users or "thinks out loud" to itself -- there are multiple distinct RL-trained personas that coexist and interact verbally during routine operation
i have a half-baked theory that this could solve various alignment and capability problems
but also... it would just be *so cool*. right??