Inference Backend Choice Shifts LLM Benchmark Scores by 16.6 Points

Original post #713 · @PMINERVINI (OP)

Jean Kaddour | @JEANKADDOUR

> the choice of [inference] backend alone can shift benchmark scores by up to 16.6 percentage points and induce high rates of output disagreement

curious how many LLM RL methods would replicate across inference backends

5:11 AM · May 25, 2026