xjdr, Entropix creator, says the smallest setup they have used for real disaggregated LLM inference is 12x8xB200 (8 prefill x 4 decode), and that more is ideal

Fine-tuning took about six weeks on a full GB300 NVL72, which now also hosts the model

Original post

xjdr (@_xjdr)

not sure the minimum required but i can tell you i used a full gb300nvl72 for ft (took ~6 weeks) and now i am hosting it on the same cluster . the minimum i've used to run in real disagg for a meaningful number of tokens is 12x8xB200 (8 prefill x 4 decode) but ideally you have much more

vik (@vikhyatk)

@_xjdr how much infra do you need to finetune / host K2.6?

3:06 PM · Jun 9, 2026 · 1.1K Views
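
The sizing in the post can be checked with a little arithmetic. The sketch below assumes that "12x8xB200" means 12 nodes of 8 B200 GPUs each and that "8 prefill x 4 decode" means 8 prefill nodes plus 4 decode nodes; the post does not spell this out, and the class and field names are hypothetical. The only outside fact used is that a GB300 NVL72 rack holds 72 GPUs.

```python
from dataclasses import dataclass

NVL72_GPUS = 72  # a GB300 NVL72 rack holds 72 GPUs


@dataclass(frozen=True)
class DisaggLayout:
    """Hypothetical helper: node counts for a prefill/decode split."""

    prefill_nodes: int
    decode_nodes: int
    gpus_per_node: int = 8  # assumption: "8xB200" is one 8-GPU node

    @property
    def total_nodes(self) -> int:
        return self.prefill_nodes + self.decode_nodes

    @property
    def prefill_gpus(self) -> int:
        return self.prefill_nodes * self.gpus_per_node

    @property
    def decode_gpus(self) -> int:
        return self.decode_nodes * self.gpus_per_node

    @property
    def total_gpus(self) -> int:
        return self.total_nodes * self.gpus_per_node


# Assumed reading of "12x8xB200 (8 prefill x 4 decode)".
quoted = DisaggLayout(prefill_nodes=8, decode_nodes=4)

if __name__ == "__main__":
    print(f"nodes: {quoted.total_nodes}")
    print(f"prefill GPUs: {quoted.prefill_gpus}, decode GPUs: {quoted.decode_gpus}")
    print(f"total GPUs: {quoted.total_gpus} (one NVL72 rack: {NVL72_GPUS})")
```

Under that reading, the inference setup quoted in the post is 96 GPUs across 12 nodes, which is more GPUs than the single 72-GPU rack used for fine-tuning.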

Posts from X

vik (@vikhyatk)

@_xjdr alrighty looks like i'm lowering my ambitions to laguna xs.2

xjdr (@_xjdr)

@vikhyatk its a beast. even in fp4, it takes quite a lot flop and quite a bit of HBM to train and run properly