1h ago

Qwen releases Qwen3.7-Max, its latest flagship model for agent workloads that achieves 69.7 on Terminal-Bench 2.0 and completed a 35-hour kernel optimization with over 1,000 tool calls

Supports multi-file coding agents, MCP integrations, and multi-agent orchestration.

0
Original post

📣Meet Qwen3.7-Max — our latest flagship, made for the Agent Era. A versatile foundation for agents that actually get things done: 🧑‍💻 Coding agent, end to end. Frontend prototypes, multi-file refactors, real debugging — nails it. 🗂️ A reliable office and productivity assistant. Get your work done through MCP integrations and multi-agent orchestration. ⏱️ Long-horizon autonomy. 35 hours straight on a kernel optimization task — 1,000+ tool calls, zero hand-holding. 🔌 Scaffold-agnostic. Claude Code, OpenClaw, Qwen Code, or your own stack. Consistent reliability everywhere. API's up on Alibaba Model Studio. You can also take it for a spin on Qwen Studio. Go build something wild!🏃🏃‍♂️ 📖 Blog: https://qwen.ai/blog?id=qwen3.7 ✅ Qwen Studio: https://chat.qwen.ai/?models=qwen3.7-max ⚡️ API:https://modelstudio.console.alibabacloud.com/ap-southeast-1?tab=doc#/doc/?type=model&url=2840914_2&modelId=qwen3.7-max&serviceSite=international

6:15 AM · May 21, 2026 View on X

Alibaba released Qwen 3.7 max. Benchmarks incredible.

Their new model ran autonomously for 35 hours, made 1,158 tool calls, and achieved a 10x speedup - on a single attention kernel.

This isn't "AI improving itself across the board." It's a model grinding through compile-profile-rewrite loops on one well-defined optimization target.

Impressive? Absolutely. The kind of self-improvement people will imagine when they see the headline? Not yet.

The actually interesting claim is buried deeper: Qwen says agentic capabilities generalize from diverse training environments the same way language capabilities generalize from diverse text. If that holds, it's a bigger deal than any benchmark number.

QwenQwen@Alibaba_Qwen

📣Meet Qwen3.7-Max — our latest flagship, made for the Agent Era. A versatile foundation for agents that actually get things done: 🧑‍💻 Coding agent, end to end. Frontend prototypes, multi-file refactors, real debugging — nails it. 🗂️ A reliable office and productivity assistant. Get your work done through MCP integrations and multi-agent orchestration. ⏱️ Long-horizon autonomy. 35 hours straight on a kernel optimization task — 1,000+ tool calls, zero hand-holding. 🔌 Scaffold-agnostic. Claude Code, OpenClaw, Qwen Code, or your own stack. Consistent reliability everywhere. API's up on Alibaba Model Studio. You can also take it for a spin on Qwen Studio. Go build something wild!🏃🏃‍♂️ 📖 Blog: https://qwen.ai/blog?id=qwen3.7 ✅ Qwen Studio: https://chat.qwen.ai/?models=qwen3.7-max ⚡️ API:https://modelstudio.console.alibabacloud.com/ap-southeast-1?tab=doc#/doc/?type=model&url=2840914_2&modelId=qwen3.7-max&serviceSite=international

1:15 PM · May 21, 2026 · 43.9K Views
1:31 PM · May 21, 2026 · 5K Views
Qwen releases Qwen3.7-Max, its latest flagship model for agent workloads that achieves 69.7 on Terminal-Bench 2.0 and completed a 35-hour kernel optimization with over 1,000 tool calls · Digg