What @ClementDelangue points out I have seen in practice with large Fortune 500 clients. We are on track to saving one client ~$8 million dollars that was going to go to OpenAi and Anthropic that is now on local models. They were wasting so much before our audit.
We audit, they save… millions.
A study from @Stanford showed that 71.3% of chatgpt queries could be accurately answered by a local model. I suspect a major part of enterprise AI workloads could be run locally too for free (compared to the massive costs of frontier API cost).
Also, it reduces the risk of these workloads being taken away from you because you own the models instead of renting them - which sounds like a good idea these days haha.
That's why we're introducing the ability for everyone to filter AI models on @huggingface based on your local hardware.
For me, there are 800k+ public models that fit on my M5 24GB and that I can use easily thanks to llamacpp.
Let's go local AI!








