2h ago

Experts Release DPrivBench To Test LLMs On Differential Privacy Reasoning

Original post

Can frontier models reason about Differential Privacy? We are excited to release DPrivBench, a benchmark curated by DP experts to evaluate LLMs’ reasoning ability on Differential Privacy. @pengrun_huang @chien_eli @omthkkr @kamalikac @yuxiangw_cs @ruihan_w

DPrivBench contains two complementary tracks:
Category 1 (preliminary): fundamental DP mechanism questions focused on sensitivity calculation and noise calibration.
Category 2 (advanced): research-level DP algorithms from the literature that require advanced, algorithm-specific mathematical reasoning.

Our findings are both promising and cautionary: frontier models are strong on Category 1, yet still struggle with research-level DP algorithms that require nuanced reasoning about assumptions, privacy accounting, and algorithm-specific guarantees.
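For readers unfamiliar with the terms in Category 1, here is a minimal sketch of what "sensitivity calculation and noise calibration" usually means, using the standard Laplace mechanism on a bounded mean. It is not taken from DPrivBench: the function names, the replace-one neighboring relation, and the assumption that the dataset size n is public are all illustrative choices made for this example.

```python
import random


def bounded_mean_sensitivity(lower: float, upper: float, n: int) -> float:
    """L1 sensitivity of the mean of n values clipped to [lower, upper].

    Assumes replace-one neighbors and a public n (illustrative assumptions):
    changing one record moves the sum by at most (upper - lower).
    """
    if n <= 0 or upper < lower:
        raise ValueError("need n > 0 and upper >= lower")
    return (upper - lower) / n


def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Noise calibration for the Laplace mechanism: b = sensitivity / epsilon."""
    if sensitivity < 0 or epsilon <= 0:
        raise ValueError("need sensitivity >= 0 and epsilon > 0")
    return sensitivity / epsilon


def sample_laplace(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng):
    """Clip, average, then add Laplace noise calibrated to the sensitivity."""
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    scale = laplace_scale(
        bounded_mean_sensitivity(lower, upper, len(clipped)), epsilon
    )
    return true_mean + sample_laplace(scale, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    ages = [23, 35, 41, 29, 52, 38, 47, 31, 60, 26]
    print(private_mean(ages, lower=0, upper=100, epsilon=1.0, rng=rng))
```

The sketch uses ordinary floating-point sampling, so it is meant only to illustrate the arithmetic and is not hardened for production use.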

4:32 PM · May 22, 2026