Perceptron AI releases research formalizing visual reasoning, using the card game Set to show how spatial grounding improves reinforcement learning

Initial reasoning strategies were found to shape reinforcement learning outcomes.

Original post

Perceptron AI@perceptroninc

Reasoning in visual content remains an open problem - we are sharing our first study on formalizing visual reasoning. In this blog we walk through reasoning capabilities (backtracking and spatial grounding) in a case study of visual games (Set!).

10:34 AM · Jun 22, 2026 · 3.6K Views

Posts from X

Most Activity

Views 2.9K · Bookmarks 12 · Likes 21

Armen Aghajanyan@ArmenAgha

Starting to share deeper portions of our research. The reasoning strategy a base model starts with determines where RL ends up; for vision, grounding wins.

Perceptron AI@perceptroninc

Retweets 6

Jeremy Dohmann@jecdohmann

We taught a model to play Set! to explore which reasoning strategies make a model successful during RL. A base model's initial reasoning strategies determine the final outcome of RL and for visual problem solving, visually grounded reasoning is superior🧵
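For readers unfamiliar with Set, the rule the model has to apply can be sketched in a few lines. This is only an illustration of the standard game rule (three cards form a set when, for every attribute, the values are all the same or all different). It is not Perceptron's code, and the `Card` fields and the helper names `is_set` and `find_sets` are hypothetical. The model in the study works from images of the board, whereas this sketch assumes the cards have already been read correctly.

```python
from itertools import combinations
from typing import NamedTuple


class Card(NamedTuple):
    # Hypothetical symbolic encoding of one card's four attributes.
    number: int
    shape: str
    shading: str
    color: str


def is_set(a: Card, b: Card, c: Card) -> bool:
    # Per attribute, the three values must be all equal (1 distinct value)
    # or all different (3 distinct values); exactly 2 distinct values fails.
    return all(len({x, y, z}) != 2 for x, y, z in zip(a, b, c))


def find_sets(board: list[Card]) -> list[tuple[Card, Card, Card]]:
    # Brute-force check of every three-card combination on the board.
    return [trio for trio in combinations(board, 3) if is_set(*trio)]


board = [
    Card(1, "oval", "solid", "red"),
    Card(2, "oval", "solid", "red"),
    Card(3, "oval", "solid", "red"),
    Card(1, "diamond", "striped", "green"),
]
valid_sets = find_sets(board)
```

Here only the first three cards form a set: they differ in number and match on every other attribute.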

Replies 1

Lucas Beyer (bl16)@giffmana

@ArmenAgha One of my fav games :)

Jeremy Dohmann@jecdohmann

Backtracking/verification are completely essential for effective reasoning even under the SFT condition alone - what we also find is that grounded reasoning trains more stably for longer

Jeremy Dohmann@jecdohmann

Building off work by @gandhikanishk @noahdgoodman @achakravarthy01 et al, we expand into the visual domain by constructing reasoning chains to elicit desired reasoning capabilities: backtracking vs no backtracking, grounded (via bounding boxes) vs not grounded

Perceptron AI@perceptroninc

Read more here: https://www.perceptron.inc/blog/teaching-vlms-to-think-visually

Jeremy Dohmann@jecdohmann

read the blogpost here: https://www.perceptron.inc/blog/teaching-vlms-to-think-visually

Jeremy Dohmann@jecdohmann

We find that without grounding the most common failure on OOD board configurations is hallucinating cards that don’t exist. We also see that grounding shifts attention mass in the CoT and answer towards the image
