Will Brown says serving large Mixture-of-Experts models economically requires high batch sizes on 32 to 64 GPUs · Digg