Researcher Urges Continuous Coordinates Over Discrete Tokens For Bounding Box Prediction

Original post
Ethan (@torchcompiled)

Coordinates were not meant to be discrete: you risk class imbalance, and the model gets no intrinsic knowledge of the number line or a loss that caters to it. Just use a linear head to output 4 coordinates with an MSE loss, and Fourier-embed on the input side. Here’s a paper we did on that.

NVIDIA AI (@NVIDIAAI)

This #CVPR2026 paper from our research team is trending #1 on @HuggingFace 🤗

Meet LocateAnything: a vision-language detection model that rethinks bounding box prediction. For AI agents and robots, “seeing” is only useful if a model can pinpoint where something is fast enough to act.

Trained on 138M high-quality samples, LocateAnything decodes bounding boxes in parallel instead of one coordinate at a time, improving localization accuracy while dramatically increasing throughput for visual grounding and detection.

Project page: https://nvda.ws/4dKSohb

10:41 PM · May 28, 2026 · 5K Views
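
To make the argument in the original post concrete, here is a minimal pure-Python sketch. It is not the code from either paper mentioned above. The bin count, the octave-spaced frequencies, the toy weights, and the helper names (`fourier_embed`, `linear_head`, `mse_loss`, `bin_cross_entropy`) are all assumptions made for illustration. It shows two things: cross-entropy over coordinate bins gives the same loss for a near miss and a far miss, while MSE on continuous coordinates grows with distance; and a single linear head can emit all 4 box coordinates in one step.

```python
import math

def fourier_embed(coord, num_freqs=4):
    """Map a scalar coordinate in [0, 1] to sin/cos features.

    Frequencies are octave-spaced (an assumption for this sketch).
    """
    feats = []
    for k in range(num_freqs):
        angle = (2.0 ** k) * math.pi * coord
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats

def linear_head(hidden, weights, bias):
    """Plain linear layer: one output per weight row (4 rows -> 4 coordinates)."""
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(weights, bias)]

def mse_loss(pred_box, target_box):
    """Mean squared error over the box coordinates."""
    return sum((p - t) ** 2 for p, t in zip(pred_box, target_box)) / len(target_box)

def bin_cross_entropy(probs, target_bin):
    """Cross-entropy for one discretized coordinate: only the true bin's probability matters."""
    return -math.log(probs[target_bin])

# --- Discrete view: 10 bins, true coordinate falls in bin 5 (hypothetical setup) ---
NUM_BINS, TRUE_BIN = 10, 5

def peaked_at(wrong_bin, p_true=0.1):
    """Distribution with p_true on the correct bin and the rest on one wrong bin."""
    probs = [0.0] * NUM_BINS
    probs[TRUE_BIN] = p_true
    probs[wrong_bin] = 1.0 - p_true
    return probs

ce_near = bin_cross_entropy(peaked_at(4), TRUE_BIN)  # off by one bin
ce_far = bin_cross_entropy(peaked_at(0), TRUE_BIN)   # off by five bins

# --- Continuous view: same near/far misses as real-valued coordinates ---
target_box = [0.5, 0.5, 0.5, 0.5]
mse_near = mse_loss([0.45] * 4, target_box)
mse_far = mse_loss([0.05] * 4, target_box)

# --- Linear head emitting 4 coordinates at once from a toy hidden vector ---
hidden = [0.2, -0.1, 0.4]
weights = [[0.5, 0.0, 0.1], [0.0, 0.3, 0.2], [0.1, 0.1, 0.1], [0.2, -0.2, 0.4]]
bias = [0.1, 0.2, 0.3, 0.4]
pred_box = linear_head(hidden, weights, bias)

# --- Input side: a box coordinate fed to the model as Fourier features ---
embedded = fourier_embed(0.3)

print("cross-entropy near vs far:", ce_near, ce_far)
print("MSE near vs far:", mse_near, mse_far)
print("predicted box:", pred_box)
print("embedding size:", len(embedded))
```

In this toy setup the two cross-entropy values are identical, because the loss only looks at the probability assigned to the correct bin, whereas the two MSE values differ according to how far the prediction is from the target.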
Retweets 4
7d · Views 5K · Likes 45 · Bookmarks 21