Austin's account of an AI scheduling startup reveals it required 75 human operators to handle complex edge cases

Views: 53.1K · Bookmarks: 123 · Likes: 126 · Retweets: 7 · Replies: 7

The long slog to take humans out of the loop… is a brutally long slog worth persevering through!

The Certifiably Insane Way to Build an AI Agent:

1. choose a category where mistake tolerance is roughly the same as it is in self-driving cars. we chose "email-based scheduling assistant." many people want this product, but they immediately fire it if it screws up an interaction with a prospect, a candidate, or a potential investor.

2. you learn that the edge cases are too complex and too frequent to be solvable. ours: managing timezones for people who travel (and change travel plans) constantly. knowing when NOT to respond, when to text the customer on the side to verify something, when to follow up, which sub-calendar to use, when to bend the rules on availability, when we can schedule one type of call during your commute but not another. sharing your availability without compromising your privacy. and on and on.

3. the product doesn't feel viable, but you don't want to give up. you spend hours in a hot tub in Marin with a friend who makes self-driving cars. you make a plan to do it the way they did: hold the steering wheel. you go home and build a human-in-the-loop platform and hire contractors to serve as a backstop and catch mistakes before they happen (and to help design a map of what a world-class EA would do in every weird scenario). you decide trust is the currency in your category, so it must be the thing you won't compromise on. the product must succeed at any scheduling request, no matter how complicated.

4. you instantly feel an overwhelming market pull. so you keep going, growing that team to 75 people working 24/7 to support the nonstop scheduling needs of your customers. tons of engineering time goes to scaling the human platform instead of building the product.

5. you try to raise a Series A and investors say you are insane. your gross margins are extremely negative. they believe this is a problem worth solving, but they don't believe it is as hard to solve as you say. they want AI, not humans. your competitors put "NO HUMANS IN THE LOOP" on their landing pages to call you out. you keep going.

6. you work day and night building the harness that can meet the quality standard your customers have come to expect. you create a massive synthetic gold dataset. audit it, clean it, label it. repeat. then, experiments. fine-tuning. RL. ACE. DSPy. sub-agents. sub-agents for your sub-agents. rebuild the harness. throw more tokens at the problem.

7. some weeks you make big progress. some weeks your evals climb a single basis point, but that's better than nothing. more experiments. more tokens.

john coogan said the hot trend in 2026 will be dogged pursuits. that pushes you to continue the pursuit, doggedly.

8. then, one day, you realize you are scheduling thousands of meetings a day and approaching 50% autopilot with no increase in churn or complaints. you put 150 customers in a full self-driving experiment, and they use the product MORE than they were using it when they had the human backstop. you can really start to let go of the steering wheel.

9. you don't know yet if this was a hill worth climbing, but you are nonetheless stoked that you can see the top. you have created a proprietary map of what to do in a million different situations. nobody else has that map, and the models keep getting better at following maps. your plan was to bet on trust, and your product can be trusted.
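a rough sketch of the "hold the steering wheel" idea from steps 3 and 8, not Howie's actual system: every drafted reply either goes out on autopilot or is held for a human, and the autopilot rate is the share that went out untouched. the `Draft` type, the `confidence` score, and the 0.9 threshold are all hypothetical, made up for illustration.

```python
from dataclasses import dataclass

# hypothetical cutoff: below this confidence, a human reviews the draft first
REVIEW_THRESHOLD = 0.9


@dataclass
class Draft:
    request_id: str
    reply: str
    confidence: float  # assumed score from the agent's own checks, 0.0 to 1.0


def route(draft: Draft, threshold: float = REVIEW_THRESHOLD) -> str:
    """Return 'autopilot' to send as-is, or 'human' to hold for review."""
    return "autopilot" if draft.confidence >= threshold else "human"


def autopilot_rate(drafts: list[Draft], threshold: float = REVIEW_THRESHOLD) -> float:
    """Share of drafts sent without a human touching them."""
    if not drafts:
        return 0.0
    sent = sum(1 for d in drafts if route(d, threshold) == "autopilot")
    return sent / len(drafts)


# made-up sample data, two confident drafts and two uncertain ones
drafts = [
    Draft("r1", "Tuesday 3pm PT works.", 0.97),
    Draft("r2", "Moving the call to your commute slot.", 0.62),
    Draft("r3", "Following up on times for next week.", 0.93),
    Draft("r4", "Sharing availability for the investor call.", 0.71),
]

print(autopilot_rate(drafts))  # 0.5
```

in a setup like this, letting go of the wheel means lowering the threshold only while churn and complaints stay flat, which is the check described in step 8.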

today was the first day Howie crossed 50% autopilot:


Alex Cohen (@anothercohen)

I think so many people underestimate how hard the most obvious problems to solve in AI actually are

austin petersmith (@awwstn)