Divergences Induce Distinct Geometries of Sample Dependence in the Probability Simplex
Varying the operating level (the rationality, or regularization, parameter) traces a lower frontier: the best empirical loss attainable for a given amount of native sample dependence. This is the learner’s native rate-distortion curve.
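One way to write this frontier down, as a sketch in notation that is assumed here rather than taken from the post: let $q(h \mid s)$ be the learner's response to sample $s$, $\hat{L}(h, s)$ the empirical loss, $\mathcal{C}(q)$ the native sample-dependence cost, and $\beta$ the operating level.

$$
D(R) \;=\; \min_{q(h \mid s)\,:\;\mathcal{C}(q) \le R} \; \mathbb{E}_{s}\,\mathbb{E}_{h \sim q(\cdot \mid s)}\big[\hat{L}(h, s)\big]
$$

$$
q_\beta \;\in\; \arg\min_{q(h \mid s)} \; \mathbb{E}_{s}\,\mathbb{E}_{h \sim q(\cdot \mid s)}\big[\hat{L}(h, s)\big] \;+\; \frac{1}{\beta}\,\mathcal{C}(q)
$$

If $\mathcal{C}$ is convex in $q$, sweeping $\beta$ from $0$ upward moves the pair $\big(\mathcal{C}(q_\beta),\ \mathbb{E}[\hat{L}]\big)$ along the frontier $D(R)$, from zero dependence toward minimal loss.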
The choice of dependence cost induces a geometry of sample dependence. KL gives the Shannon mutual-information geometry; other regularizers give other native coordinates. In practice, architecture and optimization determine the effective response law.
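For the KL case, the frontier can be traced numerically on a toy problem. The sketch below uses the classical Blahut-Arimoto iteration, which is a swapped-in standard technique and not something the post specifies. It assumes a finite set of samples and hypotheses; the function name `blahut_arimoto`, the sample distribution `p_s`, and the loss table `loss` are all hypothetical and chosen only for illustration.

```python
import math


def blahut_arimoto(p_s, loss, beta, iters=500):
    """Return (rate_in_nats, expected_loss) at operating level beta.

    p_s[i]     : probability of sample i (assumed known, toy setting)
    loss[i][j] : empirical loss of hypothesis j on sample i
    beta       : inverse temperature; larger beta buys lower loss
                 at the price of more sample dependence
    """
    n_s, n_h = len(loss), len(loss[0])
    q_h = [1.0 / n_h] * n_h  # marginal over hypotheses, start uniform
    cond = []
    for _ in range(iters):
        # Gibbs response: q(h|s) proportional to q(h) * exp(-beta * loss)
        cond = []
        for i in range(n_s):
            w = [q_h[j] * math.exp(-beta * loss[i][j]) for j in range(n_h)]
            z = sum(w)
            cond.append([x / z for x in w])
        # Make the marginal consistent with the current response
        q_h = [sum(p_s[i] * cond[i][j] for i in range(n_s))
               for j in range(n_h)]
    # Mutual information I(S; H) of the joint p_s[i] * cond[i][j]
    rate = sum(
        p_s[i] * cond[i][j] * math.log(cond[i][j] / q_h[j])
        for i in range(n_s) for j in range(n_h)
        if cond[i][j] > 0.0
    )
    # Expected empirical loss under the same joint
    dist = sum(
        p_s[i] * cond[i][j] * loss[i][j]
        for i in range(n_s) for j in range(n_h)
    )
    return rate, dist


# Toy problem: two equally likely samples, two hypotheses, 0/1 loss.
p_s = [0.5, 0.5]
loss = [[0.0, 1.0], [1.0, 0.0]]

# Sweep the operating level to trace (dependence, loss) pairs.
frontier = [blahut_arimoto(p_s, loss, beta)
            for beta in (0.0, 0.5, 1.0, 2.0, 4.0)]
for rate, dist in frontier:
    print(f"I(S;H) = {rate:.4f} nats, expected loss = {dist:.4f}")
```

At `beta = 0` the response ignores the sample, so the dependence is zero and the loss is that of a sample-blind guess; as `beta` grows, the points move toward lower loss and higher mutual information. A different regularizer would replace both the Gibbs update and the `rate` formula, which is where other native coordinates would enter.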
3:23 PM · May 19, 2026 · 131 Views