This entire situation should wake you up to how important open source models are
Get your own model running locally, go into debt if you have to
Matthew Berman and Garry Tan are pressing developers and users to shift toward running capable open-source models on their own hardware, citing the fragility of depending on any single vendor after high-profile models suddenly became inaccessible worldwide.
A recent export-control directive forced Anthropic to disable Fable 5 for everyone, making a freshly launched frontier model unavailable overnight and prompting fresh calls for local alternatives.
Hermes and OpenClaw are highlighted as ready-to-run open-source agents that keep memory, skills, and data on user machines while integrating with everyday apps, though hardware requirements and long-term maintenance remain user-dependent.
Supporters back local open-source models for full control and freedom from vendor lock-in after bans like the one on Fable 5, while critics cite latency problems or dismiss the push outright.
Fable is banned. Long live local AI.
Full episode breaking down exactly how to get good at local models: the runtime, the hardware, quantization, connecting it to a Hermes agent, and local AI startup ideas (25 minutes)
The takeaway from Fable 5 being BANNED by the government: GET GOOD AT LOCAL MODELS SO YOU HAVE 100% CONTROL.
My entire weekend was going to be building my craziest ideas with Fable 5. That's now cancelled.
So instead of building with Fable this weekend, I've decided I'll go deep on local models:
1. Start with the runtime. Download Ollama or LM Studio first. This is the thing that actually runs models on your machine.
2. Match the model to your hardware. A model's size is measured in billions of parameters (7B, 32B, 70B). Bigger is smarter but needs more memory. Rule of thumb: a 7B model runs on almost any laptop, a 32B needs a good Mac with 32GB+ RAM, a 70B needs serious hardware like a DGX Spark or a maxed-out Mac Studio.
3. Know which model for which job. Qwen 3 is the best all-around choice for most tasks. DeepSeek for reasoning and coding. Gemma 4 when you need something tiny that runs on a phone. Llama when you want the biggest community and the most fine-tunes.
4. Quantization. You can shrink a model to run on weaker hardware with barely any quality loss. Look for versions labeled Q4 or Q5. This is how a model that "needs" a server runs on your laptop. Learning this one concept changes everything.
5. Connect it to your agent. Point Hermes or your agent stack at a local model.
6. Context window is your real constraint locally. Cloud models give you huge context for free. Local models make you pay for it in memory. A bigger context window eats RAM fast. Keep your sessions tight and your prompts lean or your machine chokes.
7. Learn to give local models tools. A smaller local model with web search, file access, and code execution beats a giant model with none. The capability gap closes fast when you wire up the right tools. The model is the engine but the tools are the wheels.
8. Fine-tuning is more accessible than you think. You don't need this on day one, but know it exists. You can take an open model and train it on your own data so it gets good at your specific domain.
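Quick back-of-envelope math for steps 2, 4 and 6, as a rough sketch and not a sizing tool. It assumes weight memory is roughly parameters times bits per weight, and that the context (KV) cache stores one key and one value vector per layer per token. The function names and the example model shape (32 layers, 8 KV heads, head size 128) are made up for illustration; real Q4 files usually use a bit more than 4 bits per weight, and runtimes add overhead on top.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def kv_cache_gb(context_tokens: int, n_layers: int, n_kv_heads: int,
                head_dim: int, bytes_per_value: int = 2) -> float:
    """Approximate KV-cache memory in GB for one session.

    The leading 2 covers keys and values; bytes_per_value=2 assumes 16-bit cache.
    """
    total_bytes = (2 * n_layers * n_kv_heads * head_dim
                   * context_tokens * bytes_per_value)
    return total_bytes / 1e9


# Weights: 16-bit vs. an idealized 4-bit quantization.
for size in (7, 32, 70):
    fp16 = weight_memory_gb(size, 16)
    q4 = weight_memory_gb(size, 4)
    print(f"{size}B: ~{fp16:.1f} GB at 16-bit, ~{q4:.1f} GB at 4-bit")

# Context: hypothetical model shape, growing context window.
for ctx in (8_192, 32_768, 131_072):
    gb = kv_cache_gb(ctx, n_layers=32, n_kv_heads=8, head_dim=128)
    print(f"{ctx} tokens of context: ~{gb:.2f} GB of cache")
```

Under these assumptions a 70B model drops from about 140 GB to about 35 GB at 4-bit, and the cache grows linearly with context length, which is why long sessions eat RAM.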
I'll probably do a breakdown of this at some point on @startupideaspod if people are into it.
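For step 5, here's roughly what "pointing your stack at a local model" can look like, assuming Ollama is running on its default local port and you've already pulled a model. The model tag `qwen3` and the helper names `build_request` and `ask_local` are placeholders for illustration; how Hermes or any particular agent actually wires this up isn't covered here.

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "qwen3",
                  url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local(prompt: str, model: str = "qwen3") -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=120) as resp:
        return json.loads(resp.read())["response"]


# Example call (needs a running Ollama server with the model pulled):
# print(ask_local("Summarize why local models matter in one sentence."))
```

Nothing leaves your machine in this setup: the request goes to localhost, so the prompt and the answer stay on your hardware.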
The lesson from this ban is basically don't build your entire workflow on something that can disappear with a single letter. Own part of your stack. Local models are insurance.
It reminds me of when people realized they don't own their social media accounts. And then you saw people build email lists, etc.
I remember running a startup and my biggest traffic source was organic FB. All of a sudden, algo changed, and I lost 99% of my traffic.
Same sorta moment (but bigger) for AI.
This is a wake up call.

@onekapisch It’ll happen. Just wait 6 months.

@MatthewBerman But how can we even get closer to a model like Opus 4.8 locally?

@gregisenberg Eventually we will have our own Mythos/Fable locally, they can't stall the progress forever.

@gregisenberg Perfect timing, Greg, and totally agreed, this could end up being a real tipping point for local AI!

@gregisenberg Apologies for shutting down Fable

@gregisenberg Is it a revenge story?

@gregisenberg should be a good watch

@gregisenberg That's why we're building plug-and-play local inference devices with LLMs, agent harnesses, and local storage. Let's do this.

@gregisenberg Locally farmed models only

@dedene @MatthewBerman Impossible, you’ll never have Stargate level compute at home.

@LLON3RR full ep over here https://www.youtube.com/watch?v=bdhUBBACglw

@LifeOf_KB @MatthewBerman You need far more than that for any kind of reasonable replacement for Fable or GPT-5.5 on the open-weights side…
I’d argue shared services like Ollama Cloud, Synthetic, etc. running open weights on shared infrastructure are far cheaper AND more capable than a 128GB MBP

Quality and speed used to be the factors a year ago. Quality is almost frontier level now, but speed will always be the local bottleneck, because to get frontier-subscription-level speed you need 4-8 Blackwell 6000s.
1,000 tokens/second is sort of bare minimum if we're sincerely talking about replacing frontier model usage with local usage and not massively delaying productivity.

@GetCoffeeOnMe @gregisenberg I think more of an “I’m gonna need another coffee” kinda story

@benvargas @MatthewBerman Over time it will be just as expensive as getting an M5 MacBook Pro 128GB.

@MatthewBerman Dude, I’ve been a follower from the early days. You have enough followers to not follow the herd. If I never hear “go into debt if you have to” ever again, I’ll be happy. You have your followers, you’ve crafted a great brand. You can be yourself now 🙏

@kevin_smith51 @onekapisch @MatthewBerman I suspect you can achieve similar results if you break the project you're building into smaller, well-defined pieces. But if you're coming in with a top-down approach, yes, you need frontier.

@TheAhmadOsman This is your time to shine.