Good day to read this from @PeterHndrsn and Mark Lemley (afaik not on Twitter)
I promised I would post the letter Dario Amodei sent to the White House and Senators Tim Scott and Elizabeth Warren as soon as it became available:
The paper analyzes contractual limits on model output distillation
@BlackHC to paint a picture: It's not clear what evidence they have for this being crucial to training, why they don't just stop the use if it violates the ToS, and why they need to share all of this with such vitriol when it could be a corporate lawsuit
@natolambert Why? Distillation is against their ToS and the raised points are correct. Why should distillation be okay when they use bots and VPNs to circumvent access restrictions and also actually pay less than regular API users
@natolambert Corporate lawsuit where exactly? Sue them in China?
They are resorting to identity verification now to stop them, which is not great
Should they get an injunction in the US to ban the respective open-source models?

https://arxiv.org/abs/2412.07066
@BlackHC @natolambert I’m not completely convinced labs are in a good position to debate shady data access…