Thermodynamic Sampling Units Deliver 100x Performance Per Watt for Diffusion Models
educating @gbrl_dick on thermoputers and telling him what workloads will go brrrr
.@beffjezos explains how thermodynamic computing fits into the current computing stack:

"We don't do GPUs. They're called TSUs, thermodynamic sampling units. We think people are gonna put them next to GPUs or whatever their favorite accelerator is."

"You don't need to run the whole workload on the TSU, but it could run certain parts of the workload, just like the GPU is complementary to the CPU initially."

"The TSU is gonna help give you more performance per watt in your data center when combined with existing build-outs."

"Let's say you're serving a large diffusion model. What we showed is that by using a TSU, you can use something called an energy-based model instead of a simple probabilistic kernel."

"Overall you can have basically 100 times the performance per watt of just running on pure GPUs because our chips are much lower wattage."
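
For readers unfamiliar with the term "energy-based model" in the quote above, here is a minimal software sketch of what sampling from one looks like. It is an assumption-laden illustration, not the TSU's actual algorithm or interface, which the quotes do not describe. It uses plain Gibbs sampling on a tiny Ising-style model with ±1 spins. The function names (`energy`, `gibbs_sweep`, `sample`), the symmetric zero-diagonal weight matrix, and the `beta` inverse-temperature parameter are all hypothetical choices made for the example.

```python
import math
import random


def energy(state, weights, biases):
    """Energy of a +/-1 configuration: E(s) = -sum_{i<j} W_ij s_i s_j - sum_i b_i s_i.

    Assumes `weights` is symmetric with a zero diagonal (illustrative choice).
    """
    n = len(state)
    pair = sum(weights[i][j] * state[i] * state[j]
               for i in range(n) for j in range(i + 1, n))
    field = sum(biases[i] * state[i] for i in range(n))
    return -pair - field


def gibbs_sweep(state, weights, biases, rng, beta=1.0):
    """Resample every spin once from its conditional distribution given the rest."""
    n = len(state)
    for i in range(n):
        # Local field on spin i: its bias plus the weighted sum of the other spins.
        local = biases[i] + sum(weights[i][j] * state[j] for j in range(n) if j != i)
        # For the energy above, P(s_i = +1 | rest) = sigmoid(2 * beta * local).
        p_up = 1.0 / (1.0 + math.exp(-2.0 * beta * local))
        state[i] = 1 if rng.random() < p_up else -1
    return state


def sample(weights, biases, n_sweeps=200, seed=0):
    """Run a fixed number of Gibbs sweeps from a random start and return the final state."""
    rng = random.Random(seed)
    state = [rng.choice([-1, 1]) for _ in biases]
    for _ in range(n_sweeps):
        gibbs_sweep(state, weights, biases, rng)
    return state


# Three spins that prefer to agree with each other and lean slightly toward +1.
W = [[0.0, 2.0, 2.0],
     [2.0, 0.0, 2.0],
     [2.0, 2.0, 0.0]]
b = [0.5, 0.5, 0.5]
print(sample(W, b))
```

Low-energy configurations are the likely ones: here the all-aligned state has lower energy than a mixed one, so the sampler spends most of its time there. On a GPU each conditional draw is computed arithmetically with pseudo-random numbers. The pitch in the quotes is that a TSU does this kind of sampling natively at lower wattage.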