1d ago

Chenhao Tan and FAR.AI launch MechEvalAgent to detect implicit hallucinations in mechanistic interpretability research

MechEvalAgent flags cases where an agent's research claims contradict its code.
