GLM 5.2 secures second on Vending-Bench 2, outperforming GPT-5.5 in 350-day agent simulations

GLM 5.2 reached nearly $8,000 while GPT-5.5 trailed under $4,000.

Original post

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

This is a remarkable graph that reinforces the same point as autoresearch results, WeirdML, "takeout delivery bench", and all other long multi-turn scenarios. Notice: *GLM-5.2 isn't any better than 5-5.1 for the first ≈150 days*. But it learns in-context. GPT-5 takes 300 days.