Tanishq Mathew Abraham shares benchmarks showing Claude Sonnet 5 beats Sonnet 4.6 across evaluations but trails Opus 4.8 on SWE-bench Pro · Digg