🔥 AutoResearchClaw tech report + v0.5.0 just dropped.
12,300+⭐ on GitHub. Two big additions this release:
🧪 1/ Domain-Expert Agents in the experiment stage: Specialized agents for high-energy physics, biology, and more. Real domain tools + knowledge plugged in — not a generic LLM pretending to run experiments.
📊 2/ ARC-Bench: A 55-topic benchmark across ML, HEP, quantum physics, biology, and statistics. One of the broadest cross-disciplinary evaluations for autonomous research ever released.
🏆 The numbers:
→ Beats AI Scientist v2 by 54.7% on ARC-Bench
→ 7-mode HITL (human-in-the-loop) ablation: targeted intervention > full autonomy OR exhaustive oversight.
The thesis (still): real research isn't a pipeline. Hypotheses fail. Lessons compound. AutoResearchClaw is a research amplifier — not a paper generator.
📄 Tech report: https://arxiv.org/abs/2605.20025
💻 Code: https://github.com/aiming-lab/AutoResearchClaw
Thanks to @itsJiaqiLiu and @StephenQS0710, who led the work, and to all other contributors: @HaonianJi, @lillianwei423, @XinyeYee, @richardxp888, @HaoqinT, @Xinyu2ML, @WeitongZhang, @jiahengzhang96, @LINJIEFUN, @linjunz_stat, @yuyinzhou_cs, @CaimingXiong, @james_y_zou, @ZhengBerkeley, @cihangxie, @dingmyu
