12h ago

Opus 4.8 sets a record score on BullshitBench, rebounding from version 4.7's decline in resisting sycophancy. The high scores mean the benchmark now requires harder questions.

Original post

Peter Gostev (@PETERGOSTEV): Top notch result from Opus 4.8 on BullshitBench, after a slight dip with 4.7. Need to start thinking of some new harder questions soon!

2:38 AM · May 29, 2026