Hmmm... Interesting
中昊芯英 unveils its next-gen TPU AI chip as part of the 泰则 2.0 compute cluster. Each chip delivers 896 TFLOPS of compute (1792 TOPS INT8) and uses just 600W of power.
In an 8 TPU + 2 CPU setup, a box will have 7.168 PFLOPS of compute (8 x 896 TFLOPS). Natively supports PyTorch, vLLM, SGLang & other tools.
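Quick sanity check on the per-box math, as a rough sketch using only the figures quoted above. The variable names are mine, and the power number counts the 8 TPUs only (the 2 CPUs and the rest of the box are not included, since no figures were given for them).

```python
# Figures quoted in the post; names are illustrative, not from the vendor.
TFLOPS_PER_CHIP = 896       # per-chip compute (precision not stated in the post)
INT8_TOPS_PER_CHIP = 1792   # per-chip INT8 throughput
WATTS_PER_CHIP = 600        # per-chip power draw
CHIPS_PER_BOX = 8           # 8 TPU + 2 CPU setup; CPUs excluded here

# Aggregate per-box numbers (1 PFLOPS = 1000 TFLOPS, 1 kW = 1000 W).
box_pflops = TFLOPS_PER_CHIP * CHIPS_PER_BOX / 1000
box_int8_pops = INT8_TOPS_PER_CHIP * CHIPS_PER_BOX / 1000
tpu_power_kw = WATTS_PER_CHIP * CHIPS_PER_BOX / 1000

# Efficiency implied by the quoted chip figures alone.
tflops_per_watt = TFLOPS_PER_CHIP / WATTS_PER_CHIP

print(f"Box compute: {box_pflops:.3f} PFLOPS")
print(f"Box INT8:    {box_int8_pops:.3f} POPS")
print(f"TPU power:   {tpu_power_kw:.1f} kW (TPUs only)")
print(f"Efficiency:  {tflops_per_watt:.2f} TFLOPS/W per chip")
```

So the 7.168 PFLOPS box figure is just 8 x 896 TFLOPS, with roughly 4.8 kW going to the TPUs themselves.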
Completed integration w/ Qwen, DeepSeek, GLM & Minimax.
The company uses chiplets + 2.5D packaging, and the chips are connected in a cluster via optical modules & OCS (optical circuit switching).
Founder/CEO 杨龚轶凡 was a core member of Google's TPU design team for v2/3/4.
This does feel like the Chinese Google TPU.