Microsoft releases BenchPress, predicting full LLM evaluation scores within 3.93 points using five probe benchmarks · Digg