1d ago

OpenClaw scores 5.2 percent on ARC-AGI-3 benchmark

OpenClaw, an AI agent powered by Anthropic's Opus 4.7 model, scored 5.2 percent on the ARC-AGI-3 public demo set, placing it on the community leaderboard. ARC Prize lists $2.9K next to the score; the post does not say what the figure covers, but it reads as the cost of the run rather than prize money. OpenClaw used long-term memory and code execution on the tasks. On the game ka59 it completed the first two levels, then fell into a loop on the third and was cut off after taking five times as many actions as a human.

Original post

#984 @SCALING01 @ARCPRIZE

ARC Prize @ARCPRIZE

ARC-AGI-3 Community Leaderboard

OpenClaw, using Anthropic Opus 4.7, scores 5.2% ($2.9K) on ARC-AGI-3 Public Demo Set

OpenClaw used long term memory and code execution

Here OpenClaw is playing ka59, it solves the first 2 levels and then breaks down into a loop

9:53 AM · May 15, 2026

Reposted by

#984 @SCALING01

QUOTE POST

#1103 Mike Knoop @MIKEKNOOP

New harness results for ARC v3 on the community leaderboard. We want to do more of these! There is a lot of harness innovation happening and the most general ideas will migrate to the model layer.

ARC Prize @arcprize

4:53 PM · May 15, 2026 · 23.1K Views

5:52 PM · May 15, 2026 · 1.7K Views

QUOTE POST

#1212 Greg Kamradt @GREGKAMRADT

OpenClaw (via Opus 4.7) on ARC-AGI-3

It solves 2 levels of ka59 no problem, it clearly gets how to play the game. Then it gets caught in a loop and fails level 3

We cut it off after 5x human actions

5:36 PM · May 15, 2026 · 4.5K Views

#1212 Greg Kamradt @GREGKAMRADT

Thank you to Bailey and @rob0the0nerd of Klaus helping us get this set up

5:36 PM · May 15, 2026 · 688 Views