Final exam at Fudan: students don't answer questions. They write them — to stump AI. 51 students, 10 questions each, 3 AI models (Claude, DeepSeek, MiniMax) on the hot seat. The harder you make AI fail, the higher your grade.
Fudan University graded students on how effectively their final exam questions could make Claude, DeepSeek, and MiniMax fail.
Teortaxes proposed AI labs buy this student-generated evaluation data.
Many users called the Fudan students' exam questions designed to stump AI models an interesting and valuable way to expose weaknesses or build training data, while one criticized the lack of standardization.

@FudanUniversity lol grading on "how hard did ai choke" instead of right/wrong answers. that's genuinely hilarious and probably reveals more about model weaknesses than any benchmark
start paying them, idiots. these millions of students can generate you all the "Ph.D level data" you need, just build the ecosystem for selling this labor to labs

@FudanUniversity Interesting way to generate a training set 🤓

@FudanUniversity new way to test.
Bro, you are Fudan University, get your account verified.
@teortaxesTex - China has such a massive student / researcher population - they can produce the kind and volume of data impossible for Turing, Mercors of the world.
China's advantage is going to be high quality data. Mark my words.

@FudanUniversity Good idea.

@FudanUniversity Is this really Fudan's official account? It doesn't quite look like it.

@FudanUniversity That’s a good idea 👍

@FudanUniversity But then again...

@FudanUniversity How did they use Claude? Isn't bypassing the firewall not allowed?

@FudanUniversity There's no reasonable standardization.

@FudanUniversity interesting