Yoav Goldberg notes that inventive elements of mathematical reasoning, such as creating new concepts, remain difficult to verify mechanistically in AI systems.
Luca Ambrogioni replies that current results are not verified mechanistically anyway, since the output is natural-language text.
@yoavgo Keep in mind that right now things are not verified mechanistically anyway. It's natural text
by weird and annoying complete coincidence, these also happen to be parts which are very hard to verify mechanistically
@yoavgo Training uses human annotated data, it's not formal verification
@LucaAmb in training it is
@yoavgo The proofs are not formalized in Lean, so I do not see how they could have used formal validation
Of course they can use LLM validation as proxy, but that's usable in any setting
@LucaAmb are you sure? i'd be surprised if it was only demonstration based sft
@yoavgo As far as I know, they are relying on highly curated reasoning chains from mathematicians
The miracle of fine tuning
@LucaAmb so they are trained only on sft from positive examples? how would that work?
I don't know if they use negative examples. I am just sure that they can't use formal validation since the output wasn't in Lean, it was a human-verified natural language paper
Formalization is a long-term goal for them, but it didn't drive progress so far. The driver of progress was high-quality human feedback
@LucaAmb without any negative example? that is super surprising if it works
I highly doubt it. The structure of a formal Lean proof is very different and it will likely not provide much help to the non-formal task
In general, I am not aware of any major math results first proven in a computer-verifiable form. They are always proven in natural language first and then formalized after the fact
@LucaAmb the result was obtained using natural language. but i suspect the training pipeline had some non-nl components, even if they were hidden from the model. but cannot know for sure of course
@yoavgo Then what form of formal verification do you have in mind?
@LucaAmb i did not say they used lean..
@yoavgo Oh ok, then I misunderstood what you meant. Can you explain to me what you mean by verification in this context?
@LucaAmb i did not say formal, i said verification. at a minimum like other RLVR approaches. let it prove known-to-be-true things as well as known-to-be-false things. do it also for intermediary steps, with a curriculum