4h ago

DALL·E 2 co-creator Alex Nichol proposes a competition to design the most efficient fixed-size tokenizer vocabulary

He seeks alternative payout methods to avoid tax paperwork.

Original post

I want to host a tokenizer competition: submit a vocab of a fixed size, and your score is how few tokens a fixed text (maybe a predetermined book) gets encoded to. However, it's kind of annoying to "pay" a winner; you have to provide tax docs etc. Anybody have any suggestions?

5:00 PM · May 30, 2026
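The proposed scoring rule can be made concrete with a small sketch. The post does not say how a vocabulary would be applied to the text (greedy longest match, BPE merges, or something else), so the version below assumes the most favorable reading: the score is the minimum number of vocabulary entries whose concatenation reproduces the text exactly, found by dynamic programming. The function name `min_token_count` and the choice to treat the text as a Python string rather than bytes are illustrative assumptions, not part of the proposal.

```python
from typing import Optional, Set


def min_token_count(text: str, vocab: Set[str]) -> Optional[int]:
    """Fewest vocab entries that concatenate to `text`, or None if impossible.

    Assumption: optimal segmentation, not greedy matching or BPE merges.
    """
    max_len = max((len(tok) for tok in vocab), default=0)
    unreachable = len(text) + 1  # sentinel larger than any valid count
    # best[i] = fewest tokens needed to cover text[:i]
    best = [0] + [unreachable] * len(text)
    for end in range(1, len(text) + 1):
        # Only look back as far as the longest vocab entry.
        for start in range(max(0, end - max_len), end):
            if best[start] + 1 < best[end] and text[start:end] in vocab:
                best[end] = best[start] + 1
    return None if best[-1] == unreachable else best[-1]


# Hypothetical toy vocabulary; a real entry would have a fixed, much larger size.
vocab = {"a", "b", "ab", "abab"}
print(min_token_count("ababab", vocab))  # 2: "abab" + "ab"
```

Under this reading a lower count is better, and a vocabulary that cannot spell the text at all returns `None`, which a real competition would presumably treat as a disqualified entry or avoid by requiring every single character (or byte) to be in the vocabulary.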

@unixpickle the threshold is now 2000 not 600

Alex Nichol @unixpickle

I want to host a tokenizer competition: submit a vocab of a fixed size, and your score is how few tokens a fixed text (maybe a predetermined book) gets encoded to. However, it's kind of annoying to "pay" a winner; you have to provide tax docs etc. Anybody have any suggestions?

12:00 AM · May 31, 2026 · 1.9K Views
4:25 AM · May 31, 2026 · 101 Views

@unixpickle i think people would compete even without a prize. keller's audience is always down for a competition it seems, just for public bragging rights and a tweet

4:32 AM · May 31, 2026 · 255 Views

Also, how much prize money would it take for you to try this?

12:00 AM · May 31, 2026 · 784 Views