
Julien Chaumond, Hugging Face CTO, cloned a 68.2 TB dataset in 1 minute 55 seconds using Xet deduplication

The clone completed even though his local disk holds only 4 TB.

Original post

We are starting to be quite bullish about getting in the data infrastructure business. I just cloned 68 TB (while I only have a 4TB local disk) to my @huggingface training bucket in 1 minute 55 seconds, thanks to Xet deduplication and all our infra optimizations. You can host your data processing pipelines on HF and leverage those insane optimizations 🔥

9:37 AM · May 28, 2026