/AI1d ago

Developer Builds Fully Local AI Agent on RTX 5090 with Telegram Access

--0--
Original posts
Reposts
Original postTeknium 馃#259
witcheer@witcheer

I spent today turning a blank Hermes install into a sovereign local agent on my RTX 5090. always-on, reachable over telegram, fully local. here's everything it can do.

it runs Qwen3.6-27B on the 5090 and answers from my phone over telegram.

~~~ I first made it faster. switched to the MTP build for self-speculative decoding: 62 > 115 tokens/sec.

~~~ it now benchmarks models on command. I ask it to speed-test a gguf and it stops its own model server to free the gpu, runs llama-bench, renders a card, sends it to my phone, then restarts itself. it's never left offline.

~~~ it also knows me very well. I pointed it to my private HF repo where I store all my AI traces since I started using it. I created a local markdown memory vault, seeded from my own history, with semantic search over it.

~~~ the whole stack stays home: weights on disk, inference on the 5090, the agent in my pocket.

here is the video it created to showcase everything it has now.

7:14 AM 路 Jun 1, 2026 路 9.1K Views
Sentiment
Sentiment unavailable for this story.
Cluster Engagement
-
Views
-
Comments
-
Reposts
-
Bookmarks
Expand data
Posts from X
Most Activity
Most ActivityTimeline
No ranked X posts are available for this story yet.
Developer Builds Fully Local AI Agent on RTX 5090 with Telegram Access 路 Digg