5h ago

T3 Stack creator Theo Browne challenges Code Arena Frontend benchmark validity after Alibaba's Qwen3.7-Max ranked fourth

The model reportedly matches Claude Opus 4.6 on agentic tasks.

Original post

Since we're talking about good code benches today, here's a shitty one for reference

4:02 PM · May 26, 2026

@theo are you trying to say muse spark isn't better than 5.5?

are you feeling ok?

Theo - t3.gg (@theo)

Since we're talking about good code benches today, here's a shitty one for reference

11:02 PM · May 26, 2026 · 40.3K Views
11:08 PM · May 26, 2026 · 1.6K Views
T3 Stack creator Theo Browne challenges Code Arena Frontend benchmark validity after Alibaba's Qwen3.7-Max ranked fourth · Digg