
@tszzl there is this from mythos, but i think they removed it for opus 4.8, saying it was saturated. the conclusion of this section is that it's still marginally better than the prev model overall iirc, but idk, it's hard to judge. we need more/better third-party evals

