I solved an open problem in optimization with ChatGPT Pro.
Consider min-max problems, where we have to minimize with respect to some variable and maximize with respect to other variables to find a saddle-point.
I solved an open problem in optimization with ChatGPT Pro.
Consider min-max problems, where we have to minimize with respect to some variable and maximize with respect to other variables to find a saddle-point.
Users praise the cool result of a researcher solving an open min-max optimization problem using ChatGPT Pro.
If the problem is convex in the first variable and concave in the second one, we can use two online learning algorithms playing against each other. It is easy to show (e.g., see my online learning book) that the average of the iterates will converge to the saddle point.
I solved an open problem in optimization with ChatGPT Pro.
Consider min-max problems, where we have to minimize with respect to some variable and maximize with respect to other variables to find a saddle-point.
ChatGPT 5.5 Pro found a way to use one of my partial results to complete the proof, a highly non-trivial one.
The full pdf, including my prompting strategies, is here: https://arxiv.org/pdf/2606.11773
Please let me know what you think.
@SebastienBubeck, you might also like it :)
So, I tried ChatGPT.
But ChatGPT 5.5 Pro failed, no matter what I tried.
So, I used a different strategy: I wrote a pdf with everything I knew about this problem, including results that were not immediately useful, and told ChatGPT to complete the proof. This time it worked!
So, I tried ChatGPT.
But ChatGPT 5.5 Pro failed, no matter what I tried.
So, I used a different strategy: I wrote a pdf with everything I knew about this problem, including results that were not immediately useful, and told ChatGPT to complete the proof. This time it worked!
The difficulty is that the entropy function of the multiplicative updates is not differentiable on the boundary of the simplex, so certain limit operations fail
Intuitively, the Bregman is infinitely steep on the boundary, hence the iterates "slow down" even without convergence
This is my second LLM-assisted proof. In both cases, the LLMs failed at first.
I believe this suggests the need for specific harnesses or prompting strategies that encourage exploring ways that are farther than 1-step away.
As promised, we put on Arxiv the proof we did with Gemini. https://arxiv.org/pdf/2505.20219
This shows that the Polyak stepsize not only will not reach the optimum, but it can cycle, when used without the knowledge of f*.
Gemini failed when prompted directly ("Find an example where the best and average iterate do not converge"), but it worked when I gave more specific instructions ("Find a function and an initial point where it generates a cycle of length 3 and none of the iterates nor their average converge to the minimum").
As you can see, the proof is not difficult, but it is very creative: Rewriting the update with trigonometric functions and using their doubling formulas to show the cycle is not something I would have thought!

However, one can do better if the problem is smooth (i.e., the gradient is Lipschitz). In this case, using two optimistic online learning algorithms, the convergence will be faster. Moreover, if you use optimistic online gradient descent/ascent, the last iterate will converge

This is interesting because it means that the actual plays of the two players will converge to the optimal play, rather than the average of their plays. Now, what happens if we use optimistic multiplicative weights updates?

The difficulty is that the entropy function of the multiplicative updates is not differentiable on the boundary of the simplex, so certain limit operations fail
Intuitively, the Bregman is infinitely steep on the boundary, hence the iterates "slow down" even without convergence

One can still show that the convergence will be fast, yet it was unknown if the last iterate would converge too. People proved all sorts of partial results on this problem, but a general answer was still missing. I also tried to solve it on and off in the past few years.

@bremen79 Very cool result @bremen79!