WeaveBench Introduces 114-Task Hybrid Benchmark for Computer-Use Agents · Digg