Chess Model Training Costs Scale Cubically With Compute

Original post

sophia @cis_female · #1281 in Tech

training cost for chess models is cubic in model FLOPs: a 2x bigger model means 2x more FLOPs/position, 2x more positions needed, and probably 2x more FLOPs to generate the training data

2:18 AM · Jun 10, 2026 · 780 Views
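A minimal sketch of the arithmetic in the post, taking its three factors at face value and multiplying them. The function name and the assumption that each factor grows exactly linearly with model size are illustrative only; the post offers these as rough estimates, not measured scaling laws.

```python
def relative_training_cost(scale: float) -> float:
    """Relative total cost when the model is `scale` times bigger.

    Assumption (from the post, not a measured law): each of the three
    factors grows linearly with model size, and they multiply.
    """
    flops_per_position = scale  # bigger model, more FLOPs per training position
    positions_needed = scale    # more positions to train the model to capacity
    datagen_flops = scale       # more FLOPs to generate the training data
    return flops_per_position * positions_needed * datagen_flops


if __name__ == "__main__":
    for s in (1, 2, 4, 8):
        print(f"{s}x model -> {relative_training_cost(s):g}x training cost")
```

Under these assumptions, every doubling of model size multiplies the total cost by 8.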

Sentiment

The one reply with sentiment described the cubic scaling of chess model training costs as "raw cubic pain": training scaling laws without the PR spin.

Pos: 0.0%

Neg: 100.0%

1 comment with sentiment.

Cluster Engagement

Posts from X

Most Activity

Views: 268 · Likes: 2 · Replies: 1

sophia@cis_female

#3 (the data-generation cost) isn't really true for anything at or below ~10 GFLOPs (lc0 model size), but I think it holds if you want to make a new big model.

the "2x more positions" point is subtle: in chess, like in LLMs, bigger models need less training to reach a given quality level. but a 2x bigger model means you can evaluate 2x fewer nodes at inference, and for a bigger model to be worth it you almost certainly need to train it to capacity, which means roughly 2x more positions.
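A rough sketch of the node-budget trade-off described above: with a fixed inference compute budget, the number of nodes a search can evaluate falls in proportion to the model's FLOPs per evaluation. The budget value and the function name are hypothetical; only the ~10 GFLOPs model size comes from the post.

```python
def nodes_searchable(budget_flops: float, flops_per_eval: float) -> float:
    """Nodes that fit in a fixed inference budget, ignoring search overhead."""
    return budget_flops / flops_per_eval


if __name__ == "__main__":
    budget = 1e15        # hypothetical per-move inference budget, in FLOPs
    base_model = 10e9    # ~10 GFLOPs per evaluation (lc0 model size, per the post)

    small = nodes_searchable(budget, base_model)
    big = nodes_searchable(budget, 2 * base_model)  # a 2x bigger model
    print(f"base model: {small:,.0f} nodes, 2x model: {big:,.0f} nodes")
```

The bigger model gets half the nodes, so each evaluation has to be correspondingly better for the trade to pay off, which is the post's argument for training it to capacity.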

sophia@cis_female

the last claim is that for small models, just getting in the ballpark of correct is fine, because you'll be looking at tons of positions and can calibrate that way. but for big models, you need extremely precise calibration. again, this is needed because otherwise you can't make real use of the big models

Rugbist@rugbist_

@cis_female training scaling laws but without the beautiful PR spin

just raw cubic pain