
Open-source AI builder @xlr8harder argues model distillation is more effective than crowdsourcing for resolving underlying technical problems

Research engineer Florian Brand agreed with the assessment.

Original post

@xeophon I'm not sure I actually do (unless the answer is that it doesn't.) Crowd sourcing could do some of this but all of the efforts so far have been quite lackluster. Oh, just distillation, I guess?

12:47 AM · May 30, 2026

@xlr8harder @xeophon you don’t have to

closed model needs to be good at every use case

open model needs to be good at your use case

just do last-mile training on your own tasks and data

will brown (@willccbb) · 8:10 AM · May 30, 2026 · 50 Views

@xlr8harder @xeophon open models don’t need to beat closed models outright, they just need to be close enough that you can bridge the gap and then some, relatively quickly and cheaply

for anything served at scale, you can amortize out the training cost pretty quickly, and retrain regularly

will brown (@willccbb) · 8:12 AM · May 30, 2026 · 55 Views
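
The amortization point in the post above can be made concrete with a small break-even calculation. This is only a sketch: the function name, the cost model (a one-off training cost repaid by a per-request saving), and every figure are hypothetical and do not come from the thread.

```python
import math

def break_even_thousands(training_cost, closed_cost_per_1k, open_cost_per_1k):
    """Thousands of requests needed before serving savings repay training.

    All arguments are in the same currency unit. Returns None when the
    tuned open model is not cheaper to serve, since it then never breaks
    even on cost alone.
    """
    saving_per_1k = closed_cost_per_1k - open_cost_per_1k
    if saving_per_1k <= 0:
        return None
    return math.ceil(training_cost / saving_per_1k)

# Hypothetical figures, chosen only for illustration:
# one training run costs 5000, the closed model costs 10 per 1k requests,
# the tuned open model costs 2 per 1k requests.
print(break_even_thousands(5000, 10, 2))  # prints 625
```

Under these assumed numbers, the training run pays for itself after 625 thousand requests. Retraining regularly, as the post suggests, would repeat the one-off cost each cycle, so the same check applies per retraining interval.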

@xlr8harder @xeophon crowdsourcing will come in due time; have learned a lot of lessons here, you need activation energy to be super low and QC to be really high, but these are solvable with the right tooling

8:13 AM · May 30, 2026 · 49 Views

@xlr8harder ding ding ding

7:48 AM · May 30, 2026 · 80 Views

@willccbb @xeophon This is related to a question I've been exploring. Say you need to do meaningful domain adaptation, the kind of thing you'd probably need CPT for. Is there any good way to do this without wrecking post-trained behavior?

This seems like it would be very valuable.

10:22 AM · May 30, 2026 · 22 Views