Today we're releasing Perceptron Agentic Detection: localize anything you can describe in natural language or show examples of.
Excited to announce our Agentic Detection API - an API that delivers frontier performance on dense, ambiguous detection tasks.
5/ Don't have a clean way to describe the object? Use visual exemplars.
4/ It handles dense scenes the same way. We gave both models Brisbane Airport zero-shot. Single-pass detection has trouble at this density.

The agentic harness issues multiple Mk1 calls per request, so it can zoom in and take a second look at the small details one-shot detectors miss.

Mk1 runs inside an agentic harness with programmatic control over the image. It zooms, tiles, and crops on its own, which lets it localize hard objects and annotate scenes with thousands of instances.
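
To make the tiling idea concrete, here is a minimal sketch of how an image can be split into overlapping crops and how a box found inside one crop maps back to full-image coordinates. It is an illustration only, not Perceptron's implementation: the function names, the 512-pixel tile size, and the 64-pixel overlap are all assumptions.

```python
from typing import List, Tuple

# Boxes and tiles are (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def tile_origins(size: int, tile: int, overlap: int) -> List[int]:
    """Start offsets along one axis so that tiles cover [0, size)."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be >= 0 and smaller than tile")
    if tile >= size:
        return [0]  # a single tile already covers the whole axis
    stride = tile - overlap
    origins = list(range(0, size - tile, stride))
    origins.append(size - tile)  # last tile sits flush with the far edge
    return origins


def make_tiles(width: int, height: int, tile: int = 512, overlap: int = 64) -> List[Box]:
    """Overlapping crop rectangles covering a width x height image (hypothetical helper)."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_origins(height, tile, overlap)
        for x in tile_origins(width, tile, overlap)
    ]


def to_full_image(box: Box, tile_box: Box) -> Box:
    """Shift a box from tile-local coordinates into full-image coordinates."""
    ox, oy = tile_box[0], tile_box[1]
    x0, y0, x1, y1 = box
    return (x0 + ox, y0 + oy, x1 + ox, y1 + oy)


if __name__ == "__main__":
    tiles = make_tiles(1000, 400, tile=400, overlap=100)
    print(tiles)
    print(to_full_image((10, 20, 30, 40), tiles[1]))
```

The overlap is there so an object cut by one tile boundary appears whole in a neighboring tile. A real pipeline would also need to merge duplicate detections from overlapping tiles, which this sketch leaves out.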

One API, three ways to ask:
- Detect everything: an exhaustive scene inventory, no pre-defined labels
- Open-vocab categories: any class list, boxes or points
- Visual exemplars: teach a class from a single example crop

Same endpoint works across many viewpoints and sensor types. What we’re building:
- Wildfire origin detection from satellite imagery
- Power-pole component inspection for utilities
- Grasp and success detection for robots
- Object grounding in LiDAR for AVs
- Inventory counting in retail
- Scene understanding on smart glasses

Available today. Built on Perceptron Mk1. $0.15/M input, $1.50/M output. Try it now: https://www.perceptron.inc/demo?mode=detect Read blog: https://perceptron.inc/blog/introducing-perceptron-agentic-detection
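
For a rough sense of what those rates mean per job, here is a small estimator. It assumes "/M" means per million tokens and that usage is summed across every model call the harness makes for a request. The token counts in the example are made up for illustration, not measured.

```python
# Rates quoted in the announcement, assumed to be USD per million tokens.
INPUT_USD_PER_M = 0.15
OUTPUT_USD_PER_M = 1.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost for a given total token usage (hypothetical helper)."""
    return (
        input_tokens / 1_000_000 * INPUT_USD_PER_M
        + output_tokens / 1_000_000 * OUTPUT_USD_PER_M
    )


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 100k output tokens in total.
    print(f"${estimate_cost_usd(2_000_000, 100_000):.2f}")
```

With these made-up numbers, input and output contribute about $0.30 and $0.15 respectively.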

8/ We're not publishing that example. For now, we are keeping this category off the public API and are working with a trusted set of partners on this domain.

6/ Building this surprised us in one specific way. Once the model controls where it looks, it goes past detection: the loop reads what it finds and connects it to what it knows about the world, and behaviors show up that base model evals never indicated.

7/ The clearest case came from an internal eval on a long drone video over a conflict zone. The model zoomed on a small door sign, read it, understood what the sign implied about the building's contents (i.e., POL), and flagged the structure as an optimal target for the drone.

High-fidelity embodied reasoning often needs more than one look at a scene. Pointing precisely for a grasp, finding one object in a cluttered scene, reasoning across camera views: these are tasks where looking once isn’t always enough.

To address this, we built an agentic harness that issues multiple model calls per request. The model zooms in and takes a second look to resolve the small, dense, and ambiguous details one-shot detectors miss.

9/ Everything else is live today. Try it: http://perceptron.inc/demo?mode=detect Blog: http://perceptron.inc/blog/introducing-perceptron-agentic-detection

Try it: https://www.perceptron.inc/demo?mode=detect Reach out if you are interested in collaborating!