To what extent do AI-generated papers contain fabrications?
🚀 Excited to introduce FabScore for fine-grained evaluation of fabrications in automated AI research. 🧵
We evaluate 144 AI-written papers from multiple sources, including @SakanaAILabs 's AI Scientist, MLR-Bench, @AnalemmaAI 's FARS, and the 2025 #Agents4Science Open Conference.
Among 54 real conference submissions, we find that approximately 70% contain at least one fabrication; even among accepted papers, the rate remains as high as 59.3%.
📰 Paper: https://chchenhui.github.io/papers/FabScore.pdf
💻 Code: https://github.com/chchenhui/fabscore
1/