23h ago

Bleys Goodson finds MiniMax M3 achieved a 13.3% strict pass@1 rate on the DeepSWE software engineering benchmark

A 1.5x runtime extension raised the success rate to 16.8%

Sentiment: Pos 66.7%, Neg 33.3% (17 comments with sentiment)

Some users thank evaluators for clarifying MiniMax M3's DeepSWE results and view its low regression rate as promising, while others dismiss the benchmark scores as weak and accuse the model of benchmaxxing despite frontier coding.