DeepSeek is the forever GOAT of ML infra.
If they hadn't open-sourced all the god-like AI infra projects like DeepSpec / DeepEP / DeepGEMM, we'd forever be puzzling over how they can achieve super low cost and super high throughput per stream yet still be profitable.
As I've always said, closed-source model creators only know 1/100 of what their models are able to achieve. @nvidia probably only knows 1/100 of what their chips can do too.
Thank you @deepseek_ai, and long live the vibrant open source community - your hardware, your weights, your model!
> Another shocking data point is DSpark's TPOT (time per output token), which is only 2.9-5.2 ms, indicating that the built-in neural network layer in DSpark runs exceptionally fast. The latency introduced by DSpark can basically be ignored. DSpark is a small thing… but great engineering




