11h agoZ.ai deploys ZCube network architecture for prefill-decode disaggregated LLM inference clusters, cutting switch and optical module CapEx by 33 percent and raising GPU throughput 15 percent.— P99 TTFT latency fell 40.6 percent with the flattened topology.——0——Original postBZ#1158@BANGHUAZOPZ.Z.ai|@ZAI_ORGhttp://x.com/i/article/20572069232088842242:47 PM · May 20, 2026 View on X