So, how does open source match this?
SITUATION EXPLAINED: How much are frontier labs actually spending on training data?
.@SeanZCai: "Frontier labs are spending about $10 to $15 billion per lab on data."
"Really good long horizon tasks go up to $20,000 each. A complete browser-use version of SAP was rumored at $500,000."
"Despite everybody thinking the market is super crowded, we still don't have enough good quality data vendors that actually understand how to deliver product plus services in a way researchers are looking for."
"I have not seen a contract for genuinely good data get turned down because of budgetary concerns yet."







