
12. The point is that AI isn't thinking deeply. It's not reading the literature, developing reasonable evaluation criteria, nor benchmarking normalization methods against it. It repeats the field's default and confidently justifies it. In this case, we know the answer. But what happens when we don't?



