PSA: Just added a few thousand chips, including B200s and B300s, to our Dedicated Model Inference (http://api.together.ai/endpoints). With Dedicated Model Inference, you can now one-click deploy our Blackwell-optimized inference engine with auto-scale on frontier OSS models including Nemotron, Minimax, Kimi, DeepSeek, GLM and Qwen.
Together AI Adds Thousands of Blackwell B200 and B300 Chips for Model Inference
Some users praise Together AI's addition of thousands of Blackwell B200 and B300 chips as a massive upgrade, while others worry about unaffordable prices and high bills.

@vipulved still waiting for the day these chip posts come with a price tag that doesnt make me choke on my coffee

@vipulved realizing dedicated inference is like picking lanes now
cool move tho, whos spinning up first on this

@vipulved that is a massive upgrade for the stack, nice work.

@vipulved actual b200/b300 availability is half the battle right now, everyone quotes capacity you can't actually rent. is dedicated inference mostly people serving their own fine-tunes, or folks escaping shared endpoint limits?

@vipulved autoscaling frontier oss models is where deployment gets real

@vipulved Auto-scaling frontier models: the cloud promise that brings every ops team together. When the bill arrives.

@vipulved @togethercompute Awesome 👍👍👍