Victor Taelin, Bend language creator, claims GPT models trail Mythos on compiler and proof language tasks
Technologist Jason (@jxnlco) proposed running evals to target those performance gaps.
@VictorTaelin @BjarturTomas Is there something we could run evals in
@jxnlco @BjarturTomas I just wish GPT would catch up with Mythos. It isn't quite there yet; there is a LOT to improve, and some stuff is taking longer to improve than I'd like. If anything, I'd just like to help fix its bad outputs on compiler / proof lang work more directly than just waiting and hoping.
@jxnlco @BjarturTomas Yes!! But I'd need you to tell me what kind of format would work best. I have only a rough idea, but there's a lot I could provide. For example, would a massive dataset of mined theorem/proof pairs work? How granular should the steps be? Would making it interactive help? Etc.