This new eval shows how combining AI models from different providers can boost performance by 20%+ on a suite of technical problems spanning logic, numeracy, geometry, calculus, statistics, and coding.
Simply put, the models can catch each other's mistakes. 1/🧵

