Qwen3.6-35B Model Hits 195 Tokens Per Second On Dual Blackwell GPUs

Original post

Julien Chaumond

KALALA NZENIELE (@cniongolo)

I’m not sure people realize yet that you can actually run Qwen3.6-35B-A3B-Claude-4.7-Opus-abliterated-MTP-GGUF on a dual-GPU setup with NVIDIA RTX PRO 6000 Blackwell cards and still hit around 195 tokens per second on @huggingface with Hugging Face Inference! I've already tried it with the pi agent!

@julien_c 🤩