9h ago

Pedro A. Ortega, former Google DeepMind AI Safety Team lead, now at daiosai, outlines in a new arXiv paper a revealed-preference method for recovering the lower and certificate frontiers from black-box model behavior.

The work connects rate-distortion theory to generalization bounds and bounded rationality.


Original post

Following revealed-preference logic, the lower frontier and certificate frontier can be recovered from black-box behavior using loss scaling, operating paths, and local loss perturbations.

8:23 AM · May 19, 2026

Reposted by @roydanroy

Pedro A. Ortega @AdaptiveAgents

Paper: https://arxiv.org/pdf/2605.15340

Sparked by a conversation with @roydanroy on rate-distortion and generalization and his prior work with @gkdziugaite, which helped me to understand the connection to bounded rationality.

3:23 PM · May 19, 2026 · 188 Views

3:23 PM · May 19, 2026 · 1.1K Views