DALL·E 2 co-creator Alex Nichol proposes a competition to design the most efficient fixed-size tokenizer vocabulary
He seeks alternative payout methods to avoid tax paperwork.
@unixpickle the threshold is now 2000 not 600
I want to host a tokenizer competition: submit a vocab of a fixed size, and your score is how few tokens a fixed text (maybe a predetermined book) gets encoded to. However, it's kind of annoying to "pay" a winner; you have to provide tax docs etc. Anybody have any suggestions?
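A rough sketch of how such a score could be computed. The tweet does not say how a vocab is applied to the text, so this assumes the score is the minimum number of vocab entries whose concatenation reproduces the text exactly, found with a dynamic program over prefixes. The function name `min_token_count` and the choice to treat the vocab as a plain set of strings are illustrative assumptions, not part of the proposal.

```python
def min_token_count(text: str, vocab: set[str]) -> int:
    """Fewest vocab entries that concatenate to `text` (assumed scoring rule)."""
    # Longest entry bounds how far back each step has to look.
    max_len = max((len(tok) for tok in vocab), default=0)
    inf = float("inf")
    # best[i] = fewest tokens needed to encode text[:i]
    best = [0] + [inf] * len(text)
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_len), end):
            if best[start] + 1 < best[end] and text[start:end] in vocab:
                best[end] = best[start] + 1
    if best[-1] == inf:
        raise ValueError("text cannot be encoded with this vocab")
    return best[-1]


if __name__ == "__main__":
    # Greedy longest-match would give "ab", "c", "d" (3 tokens);
    # the optimal segmentation is "a", "bcd" (2 tokens).
    print(min_token_count("abcd", {"a", "ab", "bcd", "c", "d"}))
```

A real contest would also need to fix details the tweet leaves open, such as whether tokens are bytes or characters and whether encoding must be greedy or may be optimal as assumed here.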
@unixpickle i think people would compete even without a prize. keller's audience is always down for a competition it seems, just for public bragging rights and a tweet
Also, how much prize money would it take for you to try this?