Is anyone using an M4 Mac Mini w/ 24gb of memory as a local AI server? If so, any thoughts?
I have been using open source LLMs on my M4 Macs for a few months. The PoC is over, and I have determined that it’s good enough to bring AI in-house on some kind of server.
I don’t have a PC that can accommodate a video adapter or GPU module without heavy modification, and besides that, both kinds of cards are expensive as hell on their own. Really, the least expensive and possibly the best bang-for-the-$$ option appears to be an M4 Mac Mini w/ 24gb of memory.
I have an M4 Mini w/ only 16gb of memory, and I suppose I could serve AI from it while I’m not using it, but 16gb is limiting enough that only really small LLMs are feasible. I think I need something dedicated to the task, like a similar Mini but with 24gb of memory. It would do nothing other than serve up an LLM headless.
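For concreteness, here’s roughly what I mean by “serving AI on my local network.” This is just a sketch, assuming something like Ollama (or llama.cpp’s llama-server) running on the Mini and exposing an OpenAI-compatible endpoint; the hostname, port, and model tag below are placeholders, not a tested config:

```python
# Sketch only: assumes the Mini runs an OpenAI-compatible server
# (e.g. Ollama on its default port 11434). Hostname and model tag
# are placeholders for whatever you actually serve.
import json
import urllib.request

MINI_URL = "http://mini.local:11434/v1/chat/completions"

payload = {
    "model": "mistral:7b",  # any 7b/12b model that fits in the Mini's memory
    "messages": [
        {"role": "user", "content": "Summarize RAID levels in two sentences."}
    ],
}

req = urllib.request.Request(
    MINI_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    reply = json.loads(resp.read())
    print(reply["choices"][0]["message"]["content"])
```

The appeal of an OpenAI-compatible endpoint is that most chat front ends can point at it just by changing the base URL, so the Mini stays headless and everything else on the LAN just talks to it.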
Anyone doing this? If so, has it been worthwhile? Fast enough with 7b or 12b LLMs? Is the server software available for Macs solid and easy to configure/use? MCP support? Etc?
(In case it’s important: The main driver here is to stop paying $hundreds per year to one or more AI services. I pay $0/month in subscriptions if I serve up my own AI locally. I’m mostly a chat AI user, like search on steroids. I use the crap out of it, though, so even with paid plans I hit limits, which is annoying. I figure buying some hardware and serving AI on my local network will break even $$-wise in a few years; roughly, an ~$800 24gb Mini against $20–25/month for a typical paid plan works out to about three years.)