GPT can solve Erdős problems but still can't come up with simple solutions in Interaction Net programming... I left 5.5 fixing a bug in a SupGen variant overnight and it failed. Obviously it did: the solution requires writing a HOAS interpreter on HVM, and that is practically impossible unless you have the key insight that makes it go through. It is a very beautiful idea that completely reshapes how you think about the domain, and the fix itself is 2-3 lines long. Of course, I just taught it the solution so it could fix the bug it had. But I wonder if it could have found that on its own. I really wish I could set up this experiment with whatever model solved Erdős. If it could rediscover my solution independently, that would be one of the most shocking moments of my life.
this is super cool but I still do not understand how they get a model to reason coherently and usefully for that many tokens, and at this point I'm too afraid to ask














