
GLM 5.2 reproduces the SDPO research paper for $6.21, beating Opus 4.8's $46.35 cost

GLM 5.2 used fewer tokens despite more failed runs

Original post

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) @teortaxesTex

This is the beginning of serious autoresearch that can't be withdrawn with the push of a button by some responsible AI safety committee. No matter what happens next, we have early AI scientist assistants already. RSI can be local, if slow.

alphaXiv @askalphaxiv

Here’s a fun comparison between GLM 5.2 and Opus 4.8 on a one-shot reproduction of the SDPO paper

This is a hard task: the model must resolve messy verl issues and then run ablations to completion and confirm the paper’s claims.

- GLM 5.2 cost $6.21, while Opus 4.8 cost us $46.35

- Both models spent the bulk of their tokens resolving initial verl issues. GLM 5.2 went through 14 failed runs before its first success, while Opus 4.8 went through 9.

- Surprisingly, GLM 5.2 used 2.65M tokens (excluding re-reads), compared to 4.53M tokens for Opus 4.8

8:50 AM · Jun 25, 2026 · 7K Views
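The figures in the post can be compared directly. Below is a minimal sketch that only does arithmetic on the numbers quoted above. The `Run` class and its field names are hypothetical, made up for illustration. The per-token figure is only a rough effective rate, since the token counts exclude re-reads and the post does not say how those were billed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One reproduction attempt, using only figures quoted in the post."""
    model: str
    cost_usd: float
    tokens_millions: float      # excludes re-reads, per the post
    runs_before_success: int    # as reported in the post

    @property
    def usd_per_million_tokens(self) -> float:
        # Rough effective rate; re-read tokens are not counted here.
        return self.cost_usd / self.tokens_millions


glm = Run("GLM 5.2", cost_usd=6.21, tokens_millions=2.65, runs_before_success=14)
opus = Run("Opus 4.8", cost_usd=46.35, tokens_millions=4.53, runs_before_success=9)

# How many times more Opus 4.8 cost and consumed than GLM 5.2.
cost_ratio = opus.cost_usd / glm.cost_usd
token_ratio = opus.tokens_millions / glm.tokens_millions

for run in (glm, opus):
    print(f"{run.model}: ${run.cost_usd:.2f}, "
          f"{run.tokens_millions:.2f}M tokens, "
          f"~${run.usd_per_million_tokens:.2f} per 1M tokens")
print(f"Cost ratio (Opus / GLM): {cost_ratio:.2f}x")
print(f"Token ratio (Opus / GLM): {token_ratio:.2f}x")
```

Under these numbers the cost gap (about 7.5x) is much larger than the token gap (about 1.7x), so most of the difference appears to come from price per token and not from token count.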

Sentiment

Users approve of GLM 5.2 reproducing research papers at a fraction of Opus 4.8's cost because it enables open, permission-free scientific evolution.

Positive: 100.0%

Negative: 0.0%

1 comment with sentiment.

Posts from X

Views: 513 · Likes: 6

kache @yacineMTB

@teortaxesTex first autoresearch task.. make yourself faster at researching : )

Griefcliff @griefcliff

@teortaxesTex Local LLMs should play a little chip tune when you start them

小芝麻 · 慢慢聊 @BorhenZd

@teortaxesTex Now this is the logic of permissionless scientific evolution
