Gemma 4 31B Tops Open-Weight Models on TERMS-Bench LLM Negotiation Benchmark

Original post

Honored to see Gemma 4 31B on TERMS-Bench, a benchmark for LLM negotiation agents based on economic negotiation! 🤝
- Environment verifies outcomes (no LLM-as-judge)
- Top open-weight model alongside frontier peers
- Allows diagnosing why and where agents fail

An image featuring a leaderboard table ranking 15 AI agents from various providers
9:32 AM · May 28, 2026