Very proud to have funded this work in my previous role @ARIA_research. I claimed that in environments with formal world-models, RL can be used to generate proof-carrying policies by just designing the right reward function, and this is a big theoretical and empirical validation.
RL Value Functions Act as Supermartingale Certificates for Stochastic Verification
11:26 AM · Jun 10, 2026 · 2.6K Views
davidad 🎇 @davidad
https://arxiv.org/abs/2605.31524