not sure of the minimum required, but i can tell you i used a full GB300 NVL72 for ft (took ~6 weeks) and now i'm hosting it on the same cluster. the minimum i've used to run real disagg for a meaningful number of tokens is 12x8xB200 (8 prefill x 4 decode), but ideally you have much more
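for anyone doing the math on that 12x8xB200 figure, here's a rough sketch. it assumes "12x8" means 12 nodes with 8 B200s each, and that "8 prefill x 4 decode" means 8 of those nodes do prefill and 4 do decode (8 + 4 = 12). that reading, and the `DisaggLayout` name, are my assumptions and not something the reply spells out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisaggLayout:
    """Hypothetical helper: node/GPU counts for a disaggregated serving setup."""
    prefill_nodes: int
    decode_nodes: int
    gpus_per_node: int = 8  # assumption: 8 GPUs per node

    @property
    def total_nodes(self) -> int:
        return self.prefill_nodes + self.decode_nodes

    @property
    def prefill_gpus(self) -> int:
        return self.prefill_nodes * self.gpus_per_node

    @property
    def decode_gpus(self) -> int:
        return self.decode_nodes * self.gpus_per_node

    @property
    def total_gpus(self) -> int:
        return self.total_nodes * self.gpus_per_node


# assumed reading of "12x8xB200 (8 prefill x 4 decode)"
layout = DisaggLayout(prefill_nodes=8, decode_nodes=4)
print(layout.total_nodes, layout.prefill_gpus, layout.decode_gpus, layout.total_gpus)
```

under that reading the minimum disagg setup is 96 GPUs total, split 64 prefill / 32 decode.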
@_xjdr how much infra do you need to finetune / host K2.6?