Today we're releasing Perceptron Agentic Detection: localize anything you can describe in natural language or show examples of.
Users praise Perceptron's Agentic Detection API for shifting computer vision from fixed labels to natural-language instructions, so users can name what they actually want detected.
Excited to announce our Agentic Detection API - an API that delivers frontier performance on dense, ambiguous detection tasks.

The agentic harness issues multiple Mk1 calls per request, so it can zoom in and take a second look at the small details one-shot detectors miss.

Mk1 runs inside an agentic harness with programmatic control over the image. It zooms, tiles, and crops on its own, which lets it localize hard objects and annotate scenes with thousands of instances.
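To make the zoom/tile/crop idea concrete, here is a minimal sketch of how an image might be split into overlapping tiles and how a box found inside one tile maps back to full-image pixels. This is an illustration only, not Perceptron's harness: the function names, the tile size, and the overlap are all assumptions.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels


def _starts(size: int, tile: int, stride: int) -> List[int]:
    """Tile origins along one axis, with a last tile flush to the far edge."""
    if size <= tile:
        return [0]
    starts = list(range(0, size - tile + 1, stride))
    if starts[-1] != size - tile:
        starts.append(size - tile)
    return starts


def tile_grid(width: int, height: int, tile: int, overlap: int) -> List[Box]:
    """Cover a width x height image with overlapping square tiles (hypothetical helper)."""
    stride = tile - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than tile")
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in _starts(height, tile, stride)
        for x in _starts(width, tile, stride)
    ]


def to_image_coords(box: Box, tile_box: Box) -> Box:
    """Shift a box expressed in tile-local pixels back into full-image pixels."""
    ox, oy = tile_box[0], tile_box[1]
    x0, y0, x1, y1 = box
    return (x0 + ox, y0 + oy, x1 + ox, y1 + oy)
```

The overlap is there so that an object straddling a tile border still appears whole in at least one tile; a real pipeline would also need to merge duplicate boxes from neighboring tiles.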

One API, three ways to ask:
- Detect everything: an exhaustive scene inventory, no pre-defined labels
- Open-vocab categories: any class list, boxes or points
- Visual exemplars: teach a class from a single example crop

Same endpoint works across many viewpoints and sensor types.

What we’re building:
- Wildfire origin detection from satellite imagery
- Power-pole component inspection for utilities
- Grasp and success detection for robots
- Object grounding in LiDAR for AVs
- Inventory counting in retail
- Scene understanding on smart glasses

Available today. Built on Perceptron Mk1. $0.15/M input, $1.50/M output.
Try it now: https://www.perceptron.inc/demo?mode=detect
Read blog: https://perceptron.inc/blog/introducing-perceptron-agentic-detection
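Because one request can fan out into several Mk1 calls, a rough cost estimate has to sum usage across calls. The sketch below assumes "/M" means per million tokens and that each call is billed at the posted rates; neither is confirmed by the announcement, and the function name and inputs are hypothetical.

```python
from decimal import Decimal
from typing import Iterable, Tuple

# Posted rates; the "per million tokens" unit is an assumption.
INPUT_RATE_PER_M = Decimal("0.15")
OUTPUT_RATE_PER_M = Decimal("1.50")
ONE_MILLION = Decimal(1_000_000)


def estimate_request_cost(calls: Iterable[Tuple[int, int]]) -> Decimal:
    """Sum the cost in dollars over (input_tokens, output_tokens) pairs, one per model call."""
    total = Decimal(0)
    for input_tokens, output_tokens in calls:
        total += INPUT_RATE_PER_M * input_tokens / ONE_MILLION
        total += OUTPUT_RATE_PER_M * output_tokens / ONE_MILLION
    return total


# Example: two calls totalling 2M input tokens and 100k output tokens.
cost = estimate_request_cost([(1_500_000, 60_000), (500_000, 40_000)])
```

Under these assumptions the example works out to $0.30 of input plus $0.15 of output.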

High-fidelity embodied reasoning often needs more than one look at a scene. Pointing precisely for a grasp, finding one object in a cluttered scene, reasoning across camera views: these are tasks where looking once isn’t always enough.

To address this, we built an agentic harness that issues multiple model calls per request. The model zooms in and takes a second look to resolve the small, dense, and ambiguous details one-shot detectors miss.
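One way to picture the "second look" is a loop that accepts confident detections from a first pass and re-runs the detector on a zoomed crop around each uncertain one. This is a hedged sketch of that general pattern, not the actual harness: the `detect` callable, the score threshold, and the margin are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1) in full-image pixels


@dataclass
class Detection:
    box: Box
    score: float


# Hypothetical detector: takes a region of the image, returns detections
# already expressed in full-image coordinates.
DetectFn = Callable[[Box], List[Detection]]


def expand(box: Box, margin: int, width: int, height: int) -> Box:
    """Grow a box by `margin` pixels on every side, clamped to the image."""
    x0, y0, x1, y1 = box
    return (max(0, x0 - margin), max(0, y0 - margin),
            min(width, x1 + margin), min(height, y1 + margin))


def detect_with_second_look(detect: DetectFn, width: int, height: int,
                            threshold: float = 0.5, margin: int = 16) -> List[Detection]:
    """First pass on the whole image, then one zoomed pass per uncertain detection."""
    results: List[Detection] = []
    for det in detect((0, 0, width, height)):
        if det.score >= threshold:
            results.append(det)  # confident on the first look
            continue
        # Uncertain: zoom into the neighborhood and look again.
        region = expand(det.box, margin, width, height)
        results.extend(d for d in detect(region) if d.score >= threshold)
    return results
```

Each uncertain detection costs one extra model call here, which is the trade-off behind issuing multiple calls per request: more compute in exchange for resolving small or ambiguous objects.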

Try it: https://www.perceptron.inc/demo?mode=detect
Reach out if you are interested in collaborating!

@perceptroninc Computer vision is quietly moving from labels to instructions. The useful part is not detecting everything; it is letting users name the thing they actually care about.