Google Researchers Launch ContinuousBench for Differentially Private Synthetic Data

Alex Bie@alexbieyx

Over the last ~1 year, we've been thinking about how to make privacy-preserving synthetic data useful for LLM training.

@_peihanliu's intern project @GoogleResearch takes a step back to measure usefulness.

ContinuousBench is a new benchmark for differentially private synthetic data. We show current methods cannot transfer knowledge effectively, even at ε=100.

1/n

Gautam Kamath@thegautamkamath

🧵 Does DP synth text transfer useful knowledge or just superficial style mimicking?

Existing benchmarks: saturated 😕

Introducing ContinuousBench: a hard (curr methods fail at ε=100! 🤯) & leakage-proof benchmark for DP synth text!

Followup to our #ICML2024 best paper 👀 1/n

8:23 AM · Jun 8, 2026 · 688 Views
Alex Bie@alexbieyx

We release the datasets and evaluation harness for everyone to try. Our hope is that ContinuousBench will help measure and accelerate progress in DP synthetic data.

Check out the full paper for more! w/ Peihan Liu (@_peihanliu), Lucas Rosenblatt (@lucas ros), Weiwei Kong, Natalia Ponomareva, Gautam Kamath (@thegautamkamath), Rachel Cummings (@radcummings), Roxana Geambasu, Yu Gan, Lillian Tsai (@tslilyai).

https://arxiv.org/abs/2606.01849 4/4

Alex Bie@alexbieyx

Peihan wrote an excellent, short blog on ContinuousBench: https://peihanliu.com/posts/continuousbench.html

In ContinuousBench, we curate datasets that are unsolvable without training:
- Synthetic articles about fictional Pokémon-inspired creatures
- Newly scraped news from CommonCrawl

...and ask whether models trained on DP synthetic versions of these datasets can reliably answer questions about the content.

2/n

Alex Bie@alexbieyx

Our core result is that while non-DP synthetic versions of these datasets transfer factual knowledge, DP synthesis does not.

Gains are marginal compared to the base model, and fall far short of training on the real data.
