3.5 year breakeven
I have on good authority that GLM 5.2 is running at 120 tok/s across two networked Blackwell tinyboxes. $150k and that setup can be yours, either 2x tinybox or 1x tinybox pro. Never pay the cloud again.
A pseudonymous analyst is spotlighting a compact local rig built from two tinygrad Blackwell tinyboxes that supposedly delivers 120 tokens per second on GLM 5.2, framing the $150,000 purchase as a potential long-term alternative to renting cloud GPUs.
The cited 3.5-year payback includes opportunity costs and power but rests on the unconfirmed 120 t/s speed and unspecified cloud pricing assumptions, leaving the actual timeline open to debate.
Each tinybox packs four RTX Pro 6000 Blackwell GPUs for 384 GB of VRAM per box, yet no public traces or third-party runs have surfaced to back the exact throughput number on this setup.
Some users are enthusiastic about GLM 5.2 speed on the $150K Tinybox and future compute incentives, while many others call the setup a poor investment due to high costs, rapid depreciation, and cheaper alternatives.

@zephyr_z9 Should ANT/OAI really be worth 10x+ more than Z AI?
> Spending $150k up front is the same as spending $7.5k/year, implicitly assuming a 5~6% T-bill yield. Close to $12k if you typically invest in the S&P 500.
> Electricity at $2/hour (true in Seoul, South Korea; it can differ in other parts of the world) is straight up $4.5k/year if you work 40 hours/week.
> Plus an insanely hot living room, occasional electricity shortages, and the cost of installation (rent isn't free).
So you are looking at $12k~15k/year plus your inconvenience.
But yes, it is true that you never pay the cloud again (until GLM stops releasing models or the tinybox breaks, which I'd assume happens within 10 years at most).
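
The carrying-cost estimate in this reply can be reproduced with a short sketch. The function name and parameters are illustrative, and the 8% return is an assumption inferred from the "$12k on $150k" figure rather than something the commenter states.

```python
def annual_carrying_cost(capital, annual_return, power_cost_per_hour,
                         hours_per_week, weeks_per_year=52):
    """Yearly cost of owning the rig: forgone investment return plus electricity."""
    opportunity_cost = capital * annual_return
    electricity = power_cost_per_hour * hours_per_week * weeks_per_year
    return opportunity_cost + electricity


# Inputs taken from the reply; 0.08 is an inferred S&P 500 return (assumption).
t_bill = annual_carrying_cost(150_000, 0.05, 2.0, 40)
sp500 = annual_carrying_cost(150_000, 0.08, 2.0, 40)
print(f"T-bill baseline:  ${t_bill:,.0f}/year")
print(f"S&P 500 baseline: ${sp500:,.0f}/year")
```

With these inputs the electricity term is 2 × 40 × 52 = $4,160, a little under the reply's rounded $4.5k, and the totals come to roughly $11.7k and $16.2k per year. Depreciation, repairs, and rent are left out.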

@__tinygrad__ hmm. I think I'll stick to the $20 monthly plan for now

@__tinygrad__ $75k? Brother, we are not all Elon Musk here you know.

@__tinygrad__ Okay $150K.
GLM 5.2 API is about $4.40 per million output tokens and $1.40 per million input tokens.
I could hammer GLM 5.2 for ***years*** before that box paid itself off.
Unless someone ***needs*** private compute I have no idea who is buying these lol
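
A rough check of the "years" claim, using only the numbers quoted in the thread. This is a sketch under loud assumptions: it counts output tokens only, treats the unconfirmed 120 tok/s as a single continuous stream, and ignores input-token charges, electricity, and any batching gains. The helper names are illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def breakeven_output_tokens(hardware_cost, price_per_million_output):
    """Output tokens the API would sell for the same money as the hardware."""
    return hardware_cost / price_per_million_output * 1_000_000


def years_to_generate(tokens, tokens_per_second):
    """Wall-clock years to produce `tokens` at a constant, nonstop rate."""
    return tokens / tokens_per_second / SECONDS_PER_YEAR


tokens = breakeven_output_tokens(150_000, 4.40)  # API price quoted in the reply
years = years_to_generate(tokens, 120)           # unconfirmed throughput figure
print(f"{tokens / 1e9:.1f}B output tokens, {years:.1f} years at 120 tok/s")
```

Under those assumptions the box would have to generate about 34 billion output tokens, around nine years of nonstop single-stream decoding, before matching the API bill. Serving several requests in parallel would shorten that, but the thread gives no aggregate throughput number.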

@__tinygrad__ 75k rma might be the death of the local ai maxxer

@__tinygrad__ Only 150k guys

@zephyr_z9 pointless - it’ll be an outdated model by then. or running at cerebras

@__tinygrad__ No thanks, I'd rather not turn my apartment into a sauna

@__tinygrad__ I'll just wait a couple years till we get GLM 5.2 level models that can run on my macbook

@__tinygrad__ Is that 2 boxes with 4 RTX 6k each?

@__tinygrad__ how many instances can you inference in parallel on exabox

@__tinygrad__ the 'never pay the cloud' math only works at high utilization. $150k of hardware beats the cloud the moment you're running inference around the clock, and loses badly if you're not. owning vs renting compute is a bet on how constantly you'll actually use it.
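
The utilization argument can be made concrete with a small sketch. The thread never states a cloud rental price, so `cloud_rate_per_hour` below is a purely hypothetical placeholder, not a quote from any provider; the function names are illustrative as well.

```python
HOURS_PER_YEAR = 24 * 365


def payback_years(hardware_cost, cloud_rate_per_hour, utilization):
    """Years until avoided rental fees equal the purchase price."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hardware_cost / (cloud_rate_per_hour * HOURS_PER_YEAR * utilization)


def required_utilization(hardware_cost, cloud_rate_per_hour, target_years):
    """Fraction of the year the rig must be busy to pay back in target_years.

    A result above 1.0 means the target cannot be reached at that rental rate.
    """
    return hardware_cost / (cloud_rate_per_hour * HOURS_PER_YEAR * target_years)


HYPOTHETICAL_RATE = 10.0  # $/hour for comparable rented GPUs (assumption)
for u in (1.0, 0.5, 0.1):
    print(f"utilization {u:.0%}: {payback_years(150_000, HYPOTHETICAL_RATE, u):.1f} years")
print(f"needed for 3.5 years: {required_utilization(150_000, HYPOTHETICAL_RATE, 3.5):.0%}")
```

Payback time scales inversely with utilization, which is the point of the reply: halving the hours the rig is busy doubles the time it takes to beat renting.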

@__tinygrad__ It will pay for itself in only 2.5 years vs. 25 $200/mo Codex subscriptions (not accounting for electricity cost). No-brainer!

@__tinygrad__ @PatrickToulme

@__tinygrad__ 150k???

@__tinygrad__ Well, maybe when I can afford that.

@__tinygrad__ have you guys taken a look at @c0mputeAI ??

@__tinygrad__ What quant and sequence length?