Researchers Launch Ego-MC-Bench for AI Mistake Intervention

Kosta Derpanis (sabbatical in Zurich)

Apratim Bhattacharyya@apratimbh

🚨Introducing: Ego-MC-Bench (Mistake Corrections) benchmark and Ego-CoMist (Counterfactual Mistakes) dataset.

🎯Ego-MC-Bench: Where AI assistants need to intervene at the right time (when) and with the right feedback (what) to prevent mistakes.

👉https://tinyurl.com/y7y9mwrs

1/4

1:42 PM · Jun 9, 2026 · 760 Views

Apratim Bhattacharyya@apratimbh

❗️Ego-MC-Bench contains instruction-feedback pairs provided by an expert in real-world kitchen scenarios.

⏹️Current SOTA video LLMs show very poor mistake intervention capabilities.

👉Even Gemini-3-Flash achieves a mistake-intervention F1 score of only 0.18.

2/4
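
For readers unfamiliar with the metric, here is a minimal sketch of how an F1 score is computed from intervention counts. It uses the standard definition (harmonic mean of precision and recall); the thread does not say how Ego-MC-Bench matches predicted interventions to ground-truth mistakes, so the function name `f1_score` and the counts below are hypothetical and are not taken from the paper.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Standard F1: harmonic mean of precision and recall.

    true_pos:  interventions that correctly flagged a mistake
    false_pos: interventions issued when there was no mistake
    false_neg: mistakes the assistant failed to flag
    """
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts chosen only for illustration (NOT benchmark data):
# 2 correct interventions, 10 spurious ones, 8 missed mistakes.
score = f1_score(true_pos=2, false_pos=10, false_neg=8)
print(f"{score:.2f}")  # 0.18
```

With these made-up counts, an assistant that interrupts often but rarely at the right moment lands near 0.18, which gives a feel for how low that score is.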

Apratim Bhattacharyya@apratimbh

🎯A major bottleneck is the lack of appropriate video data of procedural activities with mistakes.

This is despite the abundance of procedural activity datasets.

Therefore, we propose a synthetic data generation process with counterfactual mistakes: Ego-CoMist.

3/4

Apratim Bhattacharyya@apratimbh

👉This leads to a significant improvement in mistake intervention capabilities, especially for small models ideal for edge deployment.

📜Paper: https://arxiv.org/abs/2606.09547

4/4

Apratim Bhattacharyya@apratimbh

Joint work with: @Matewhs @SanjayHaresh @RolandMemisevic