Microsoft AI researchers release paper on training a coding LLM from scratch using reinforcement learning and hill climbing · Digg