@kalomaze 3.1 pro is big model tho?
kalomaze @kalomaze
i am trying to work on the closest thing possible to a true "big model smell" eval, which is to say: something that measures something that clever post training can't trivially gap, and is cheap + topically diverse

i can't test mythos for obvious reasons, but... hmm...
6:34 PM · Jun 14, 2026 · 1.7K Views
