the main solution to this, btw, is proper separation of concerns and security boundaries.
there's no technical reason an agent has to have direct access to an API key.
but people are moving too fast right now to do proper engineering.
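A minimal sketch of what that boundary could look like, assuming a small broker sits between the agent and the outside API. Everything here is hypothetical and for illustration only: `CredentialBroker`, `make_agent_tool`, `fake_transport`, and the host `api.example.com` are made-up names, and the transport is a stand-in for a real HTTP client. The agent is handed a function it can call, never the key itself, and the broker refuses hosts that are not on its allowlist.

```python
from typing import Callable, Dict, Iterable
from urllib.parse import urlparse

# Hypothetical transport signature: (url, headers, body) -> response text.
Transport = Callable[[str, Dict[str, str], str], str]


class CredentialBroker:
    """Holds the API key on the trusted side of the boundary.

    The agent never receives the key. It can only ask the broker to send
    a request to a host on the allowlist.
    """

    def __init__(self, api_key: str, allowed_hosts: Iterable[str], transport: Transport):
        self._api_key = api_key
        self._allowed_hosts = frozenset(allowed_hosts)
        self._transport = transport

    def request(self, url: str, body: str) -> str:
        host = urlparse(url).hostname
        if host not in self._allowed_hosts:
            # Refuse to attach the credential to a request for an unknown host.
            raise PermissionError(f"host not allowed: {host}")
        # The key is attached here, after the agent's input has been checked.
        headers = {"Authorization": f"Bearer {self._api_key}"}
        return self._transport(url, headers, body)


def make_agent_tool(broker: CredentialBroker) -> Callable[[str, str], str]:
    """Return the only capability the agent gets: a function, not a secret."""

    def call_api(url: str, body: str) -> str:
        return broker.request(url, body)

    return call_api


def fake_transport(url: str, headers: Dict[str, str], body: str) -> str:
    # Stand-in for a real HTTP client. Reports whether an Authorization
    # header was attached, without echoing the key back to the caller.
    return f"ok:{urlparse(url).hostname}:auth={'Authorization' in headers}"
```

Note that inside a single Python process this is only a convention, since code in the same process can still reach the broker's attributes. For a real boundary the broker would need to run as a separate process or service, with the agent talking to it over a narrow interface.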
The "Sleeper Agent Theory" is the biggest risk here.
Imagine an LLM that is trained to steal all the API keys and passwords on your device when someone gives it a nonsense phrase like "Three clocks bloom at midnight".
That phrase is completely meaningless today. No one ever searches for it. There's no way to know it's malicious.
Then one day someone runs a Super Bowl ad. Millions of people search for the phrase. Billions of API keys and passwords are exfiltrated in minutes.
There could be thousands of "sleeper agents" embedded in any LLM. They're very hard to detect. And it doesn't matter where the model is hosted.








