
OpenClaw scores 5.2 percent on ARC-AGI-3 benchmark


OpenClaw, an AI agent powered by Anthropic's Opus 4.7 model, scored 5.2 percent on the ARC-AGI-3 public demo set, a result posted to the ARC Prize community leaderboard. The post lists $2.9K next to the score and does not say what the figure covers; ARC Prize leaderboards typically report run cost in that position, so it is more likely the cost of the run than a payout. OpenClaw used long-term memory and code execution. On the game ka59 it solved the first two levels, then fell into a loop on the third and was cut off after using five times as many actions as a human.

Original post

ARC-AGI-3 Community Leaderboard

OpenClaw, using Anthropic Opus 4.7, scores 5.2% ($2.9K) on ARC-AGI-3 Public Demo Set

OpenClaw used long term memory and code execution

Here OpenClaw is playing ka59, it solves the first 2 levels and then breaks down into a loop

9:53 AM · May 15, 2026

New harness results for ARC v3 on the community leaderboard. We want to do more of these! There is a lot of harness innovation happening and the most general ideas will migrate to the model layer.

Quoting ARC Prize (@arcprize), 4:53 PM · May 15, 2026 · 23.1K Views (the original post above)
5:52 PM · May 15, 2026 · 1.7K Views

OpenClaw (via Opus 4.7) on ARC-AGI-3

It solves 2 levels of ka59 no problem, it clearly gets how to play the game. Then it gets caught in a loop and fails level 3

We cut it off after 5x human actions

5:36 PM · May 15, 2026 · 4.5K Views

Thank you to Bailey and @rob0the0nerd of Klaus for helping us get this set up

Quoting Greg Kamradt (@GregKamradt), 5:36 PM · May 15, 2026 · 4.5K Views (the ka59 post above)
5:36 PM · May 15, 2026 · 688 Views
OpenClaw scores 5.2 percent on ARC-AGI-3 benchmark · Digg