Pedro A. Ortega, former Google DeepMind AI Safety Team lead now at daiosai, outlines a revealed-preference method to recover lower and certificate frontiers from black-box model behavior in a new arXiv paper.
The work connects rate-distortion theory to generalization bounds and bounded rationality.
——0——
Paper: https://arxiv.org/pdf/2605.15340
Sparked by a conversation with @roydanroy on rate-distortion and generalization and his prior work with @gkdziugaite, which helped me understand the connection to bounded rationality.
Following revealed-preference logic, the lower frontier and certificate frontier can be recovered from black-box behavior using loss scaling, operating paths, and local loss perturbations.
3:23 PM · May 19, 2026 · 188 Views
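
For readers unfamiliar with how loss scaling can trace an operating path along a rate-distortion frontier, here is a minimal sketch using the textbook Blahut-Arimoto iteration. It is not the paper's revealed-preference procedure, and it does not cover the certificate frontier or local loss perturbations. The function name `blahut_arimoto`, the toy binary source, and the list of `betas` are all assumptions chosen for illustration; `beta` plays the role of a loss-scaling factor.

```python
import math

def blahut_arimoto(p_x, d, beta, iters=200):
    """Return (rate in nats, expected distortion) at loss scale beta.

    p_x  : source probabilities, one per input symbol
    d    : distortion (loss) matrix, d[x][y]
    beta : non-negative scaling applied to the loss (hypothetical knob)
    """
    n_x, n_y = len(d), len(d[0])
    q_y = [1.0 / n_y] * n_y  # start from a uniform output marginal
    for _ in range(iters):
        # Channel update: p(y|x) is proportional to q(y) * exp(-beta * d(x, y)).
        chan = []
        for x in range(n_x):
            w = [q_y[y] * math.exp(-beta * d[x][y]) for y in range(n_y)]
            z = sum(w)
            chan.append([v / z for v in w])
        # Marginal update: q(y) = sum_x p(x) p(y|x).
        q_y = [sum(p_x[x] * chan[x][y] for x in range(n_x)) for y in range(n_y)]
    distortion = sum(
        p_x[x] * chan[x][y] * d[x][y] for x in range(n_x) for y in range(n_y)
    )
    # Mutual information I(X;Y) of the final channel, in nats.
    rate = sum(
        p_x[x] * chan[x][y] * math.log(chan[x][y] / q_y[y])
        for x in range(n_x)
        for y in range(n_y)
        if chan[x][y] > 0
    )
    return rate, distortion

# Toy example (assumed): uniform binary source with Hamming loss.
p_x = [0.5, 0.5]
d = [[0.0, 1.0], [1.0, 0.0]]
betas = [0.0, 0.5, 1.0, 2.0, 4.0]
# Each beta yields one operating point (rate, distortion) on the curve.
frontier = [blahut_arimoto(p_x, d, b) for b in betas]
for b, (r, dist) in zip(betas, frontier):
    print(f"beta={b:.1f}  rate={r:.4f} nats  distortion={dist:.4f}")
```

Sweeping `beta` upward moves the operating point toward higher rate and lower distortion, which is one simple sense in which scaling the loss exposes a frontier. In this sketch the source and loss are known; the black-box setting described in the post would have to infer such points from observed behavior instead.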