PSA: Just added a few thousand chips, including B200s and B300s, to our Dedicated Model Inference (http://api.together.ai/endpoints). With Dedicated Model Inference, you can now one-click deploy our Blackwell-optimized inference engine with auto-scale on frontier OSS models including Nemotron, Minimax, Kimi, DeepSeek, GLM and Qwen.
Together AI co-founder Vipul Ved Prakash says the platform has deployed thousands of NVIDIA Blackwell chips for dedicated inference.
The upgrade includes automated scaling for frontier open-source models.
Supportive replies praise Together AI's addition of thousands of Blackwell B200 and B300 chips as a massive upgrade, while skeptical replies worry about steep pricing and high bills.
Most Activity

@vipulved still waiting for the day these chip posts come with a price tag that doesnt make me choke on my coffee

@vipulved actual b200/b300 availability is half the battle right now, everyone quotes capacity you can't actually rent. is dedicated inference mostly people serving their own fine-tunes, or folks escaping shared endpoint limits?

@vipulved that is a massive upgrade for the stack, nice work.

@vipulved realizing dedicated inference is like picking lanes now
cool move tho, whos spinning up first on this

@vipulved autoscaling frontier oss models is where deployment gets real

@vipulved Auto-scaling frontier models: the cloud promise that brings every ops team together. When the bill arrives.

@vipulved @togethercompute Awesome 👍👍👍