Tech · 20d ago

Dimitris Papailiopoulos of Microsoft Research AI Frontiers limits solver challenge models to 10 million trainable weights

The update bans access to the evaluation test set

#217

Original post

Dimitris Papailiopoulos @DimitrisPapail · #217 in Tech

@xeophon I will relax them to make it more interesting 1) No more than 10M trainable weights in the solver 2) Can't peek into the test set

Florian Brand @xeophon

@DimitrisPapail don’t tempt me

what are the rules?

6:55 AM · May 30, 2026 · 225 Views

Sentiment

Commenters praised the ML solver challenge as a cute project and a fun approach worth trying under its outlined rules.

Positive: 100.0% · Negative: 0.0%

4 comments with sentiment.

Posts from X

Most Activity

Views: 435 · Replies: 1

Dimitris Papailiopoulos @DimitrisPapail

@alexjc 1) No more than 10M trainable weights in the solver 2) Can't use frozen models/api calls 3) Can't peek into the test set

Alex J. Champandard 🌱 @alexjc

@DimitrisPapail If you're willing to include things like units as hard-coded hints you can get more than 15%... what were your rules?

20d · Bookmarks: 1 · Likes: 2

Dimitris Papailiopoulos @DimitrisPapail

@xeophon also 3) can't use frozen models (i thought about a variant of that but it's not very symbolic) where you take a frozen model, and finetune linear probes at the last layer. But that's a different project altogether :p

20d

Dimitris Papailiopoulos @DimitrisPapail

@alexjc i think above 20% is extremely hard!

Alex J. Champandard 🌱 @alexjc

@DimitrisPapail OK, that's not what I imagined as pure Python program! With trainable weights it's a good approach, but with those rules and a narrow focus I think 50-60% (or more) should be the target? Maybe I should dig out my prototypes to try to add more parameters...

20d

Alex J. Champandard 🌱 @alexjc

@DimitrisPapail 0-shot and pass@1? I will see if I can make some time, sounds fun to dig in again...

20d

Dimitris Papailiopoulos @DimitrisPapail

@xeophon it's a cute project!

20d

Dimitris Papailiopoulos @DimitrisPapail

@alexjc yes :)

20d