Watch me control my computer with just my voice. This is the future of operating systems.
No hands.
GPT-Realtime 2.0 is very, very underrated.
Demo:
The assistant navigated Chrome to execute commands in Stripe
Positive users celebrate Clicky and GPT-Realtime 2.0 for enabling accessible hands-free voice control of macOS and computers, while negative users dismiss it as low-value or too expensive.
GPT Realtime 2 unlocks some real magic:
GPT Realtime 2.0 is pretty incredible
17 startup ideas that ONLY work because of what this model makes possible:
1. Real-time contract negotiation agent. Sits on a call between two parties, checks pricing tools and compliance databases in parallel, and suggests terms mid-conversation while both sides are still talking.
2. Voice-controlled trading terminal. Talk through your thesis, the agent pulls market data, runs models, checks exposure, and executes the trade while narrating every step. Five data sources checked simultaneously while you're still talking.
3. Live multilingual event host. Realtime-Translate does 70+ languages in, 13 languages out, while the speaker is still talking. Every attendee hears the speaker in their language. Conferences go global overnight.
4. Voice-first medical intake. Patient calls in, agent conducts symptom intake, pulls their chart, checks drug interactions, books the appointment. All in one call. Previous voice models mangled medical jargon. This one was domain-tuned for it.
5. AI dispatcher for field service. Plumber calls from the job site, describes the problem, agent pulls the parts manual, checks inventory, orders the part, schedules the follow-up. Plumber's hands never leave the pipe.
6. Voice-first coding companion. Talk through architecture decisions while it writes code, runs tests, and explains what it's doing. Crank reasoning to high for hard problems. Drop to minimal for quick changes.
7. Live auction agent. Connected to estate sales, equipment auctions, domain drops. It listens to the live stream, makes bidding decisions, and tells you why it's bidding or passing. Thinks harder on big-ticket items.
8. Deposition prep agent for lawyers. Listens to practice testimony, catches inconsistencies, cross-references case documents, flags problems mid-conversation. Actually understands legal terminology.
Note: for more startup ideas for the AI age go to http://ideabrowser.com
9. Live podcast research agent. Feeds you stats through an earpiece in real time. You mention a company, it whispers the revenue. You mention a trend, it pulls the data. Real-time research team for the price of an API call.
10. Silent sales coach. Listens to your call in silent mode, whispers coaching cues through your AirPods. "Ask about budget now." "They hesitated, dig deeper." 128K context means it remembers the entire hour-long conversation.
11. Voice-first property walkthrough agent. Walk through a property, describe what you see out loud, the agent pulls comps, estimates renovation costs, calculates cap rate, checks zoning in parallel. Full deal analysis by the time you walk out the front door.
12. Baby monitor that understands crying. Listens through a nursery speaker, distinguishes hunger cry from pain cry, soothes with a voice, alerts parents only when it matters. Silent listening mode means it's always on but only activates when needed.
13. Voice agent that calls your past-due invoices and collects payment. Polite, persistent, 24/7. Small businesses lose billions in unpaid invoices because nobody wants to make the awkward call.
14. AI that calls insurance companies and sits on hold for you. Navigates the phone tree, talks to the rep, fights the claim, calls you back with the result. Charge $20 per call. Everyone hates calling insurance.
15. Voice agent that handles Airbnb guest problems at 2am. Troubleshoots, dispatches maintenance if needed, follows up. Host sleeps through it. $150/month per property.
16. After-hours voice agent for law firms. Client calls at 9pm, agent does intake, assesses urgency, schedules a morning call or patches through. Missing an after-hours call costs law firms thousands.
17. Voice-first quality inspector for manufacturing. Worker wears a headset, describes what they see, agent cross-references the spec sheet, flags defects, logs the report. Hands never leave the product.
Voice was always limited by intelligence, not audio quality.
Now that it has GPT-5 class reasoning, the voice agent can actually think while it talks. That's the unlock.
Everything above was impossible 6 months ago.
Absolutely fantastic. This is how I imagine the future of computer use. I love it.
I reshared this yesterday when I first saw it but a day later it is even more impressive.
@nvidia should have had @FarzaTV show off its new AI computers.
In the early days you had to build so many things just to make this work.
Computer use wasn’t a thing, models were slow, and you had to pipeline so many models. I do miss those times but it’s amazing to see how far we’ve come.
This was 2024.

This is real, and shipped.
Download below + try for free. It's really early, but, really magical.
Enjoy.
P.S: Follow the journey @heyclicky
http://heyclicky.com/try

@3___infinix @gregisenberg 👆🌻
😍
😘

@Andrey__HQ It's actually just super simple shell commands in the bg! Realtime has a "shell_exec" tool call we gave it that will execute safe/simple 1-line commands on the machine.
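For the curious, here is a minimal Python sketch of what a "shell_exec" tool like the one described above could look like. This is not Clicky's actual implementation: the allowlist, the rejected characters, the timeout, and the helper name `is_safe_command` are all assumptions made for illustration, and the tool definition dict only follows the general JSON-schema style used for function tools.

```python
import shlex
import subprocess

# Hypothetical tool definition handed to the voice model (shape is an assumption).
SHELL_EXEC_TOOL = {
    "type": "function",
    "name": "shell_exec",
    "description": "Run one simple, safe shell command and return its output.",
    "parameters": {
        "type": "object",
        "properties": {"command": {"type": "string"}},
        "required": ["command"],
    },
}

# Hypothetical policy: only these programs may run, and no shell metacharacters.
ALLOWED_COMMANDS = {"echo", "ls", "pwd", "date", "open"}
FORBIDDEN_CHARS = set(";|&<>`$\n\r")


def is_safe_command(command: str) -> bool:
    """Accept only a single-line command whose program is on the allowlist."""
    if not command.strip() or any(ch in FORBIDDEN_CHARS for ch in command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return False
    return bool(argv) and argv[0] in ALLOWED_COMMANDS


def shell_exec(command: str, timeout_s: float = 5.0) -> dict:
    """Run the command without a shell and return a result dict for the model."""
    if not is_safe_command(command):
        return {"ok": False, "error": "command rejected by safety policy"}
    try:
        result = subprocess.run(
            shlex.split(command),  # argv list, so no shell interpretation happens
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError) as exc:
        return {"ok": False, "error": str(exc)}
    return {
        "ok": result.returncode == 0,
        "stdout": result.stdout,
        "stderr": result.stderr,
    }
```

When the model emits a `shell_exec` call, the app would pass the `command` argument to `shell_exec()` and send the returned dict back as the tool result. Running the argv list without a shell, plus the allowlist, is one way to keep it to "safe/simple 1-line commands"; a real app would likely need a stricter policy.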

@FarzaTV Dude, this is nuts, but I also feel very seen, some days this is probably exactly how a day plays out. absolutely full on chaos mode 😂 digital ADHD on steroids

@heyclicky Note 1: The always-on mode you see me use in this video is experimental. Access it by pressing CTRL 3 times. Headphones required.
Note 2: It only plays AC/DC on Spotify right now. I'll fix tomorrow I'm going to sleep lol.
Everything else works perfectly! Go crazy.

@FarzaTV my mum can finally understand what we do at @trycua thanks to this lol. for the techies curious about how background computer-use works with Clicky + Cua Driver check out: https://github.com/trycua/cua

@kimmonismus all the more when powered by cua-driver!

@ideabrowser Was gonna slack you! Boom

@FarzaTV I love it!

@gregisenberg Realtime is just a back-and-forth voice agent. To make it actually use your system you need to write scripts for each accessible element. You can use agent-desktop to do it. Written in Rust, extremely fast

@TxoriAGI Will be in @heyclicky in 2 hours.

@thatsnas1oo Haha this is just recorded directly off my machine. It is a local server though, so +/- 100-300ms

@dioscuri @yingyangwins What are your thoughts on controlling a browser by voice for studying?

@FarzaTV that’s insanely cool + very nice cua speed
glad to see u leading in this space!

@gregisenberg Running full research reports on these right now.
Be on the lookout this week http://IDEAbrowser.com/join