19h ago

Weiran Yao launches CHI-Bench, a healthcare benchmark evaluating AI agents on long-horizon clinical workflows using more than 200 MCP tools

The benchmark provides process- and outcome-based reward signals.

Original post

Introducing CHI-Bench on @huggingface: the world’s first long-horizon healthcare benchmark for AI agents. 75 real healthcare workflows + 20 apps + 200+ MCP tools + 1,290 skills + process / outcome rewards https://huggingface.co/datasets/actava/chi-bench Any questions, lmk!

6:22 PM · May 25, 2026

💥💥💥

Weiran Yao (@iscreamnearby)

1:22 AM · May 26, 2026 · 10.7K Views
5:57 AM · May 26, 2026 · 1.3K Views