New RL Algorithm GCPO Boosts Token Credit Assignment In Generative Models · Digg