@suchenzang @Alibaba_Qwen To be fair, AFAIU, these would be agentic traces for post-training, so 29 million sessions can be a good chunk of tokens!
instead of indulging in all this fear-mongering frontier-stealing narrative slop, @Alibaba_Qwen might as well release all "29 million Claude exchanges" for the benefit of open-source research and call it a day
let the public decide for themselves how "dangerously critical" these tokens actually are, relative to the other trillions upon trillions being used to train these models