1d ago

Paper Presents Cross-Tokenizer Distillation Across Qwen, Phi, And Llama Models

Sentiment

Positive: 100%

Negative: 0%

Commenters with positive sentiment describe the results of distilling Qwen, Phi, and Llama into a single 1B model as promising, and they are optimistic about the planned scaling and extensions to SFT/RL checkpoints.

1 comment with sentiment.
