
Maksym Andriushchenko clarifies that the PostTrainBench Hugging Face page hosts only static evaluation traces, unlike the RewardBench page, which hosts an eval set

Nathan Lambert acknowledged the repository serves only static outputs.


Original post

Nathan Lambert (@natolambert)

@maksym_andr ohhhh

Maksym Andriushchenko (@maksym_andr)

i think there is a difference: the RewardBench HF page hosts an eval set which makes sense to integrate in a CI.

running PostTrainBench, however, doesn't require any new data to be downloaded, except the 7 benchmarks used for it, but those are downloaded using their respective HF pages. so our HF page (https://huggingface.co/datasets/aisa-group/PostTrainBench-Trajectories) is only hosting static traces from our evaluations. this makes the whole thing a bit more mysterious :-)

1:25 PM · Jun 9, 2026 · 40 Views
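For readers who want to look at the static traces mentioned in the post, here is a minimal sketch of how they could be fetched. It assumes the `huggingface_hub` package is installed and that the dataset repo at the URL in the post is publicly readable; the helper name `fetch_traces` is hypothetical, and the file layout inside the repo is not described in the post, so the sketch only downloads the snapshot and returns its path.

```python
# Repo id taken from the URL in the post; everything else is an illustrative assumption.
REPO_ID = "aisa-group/PostTrainBench-Trajectories"


def fetch_traces(local_dir=None):
    """Download the static trace files and return the local snapshot path.

    Hypothetical helper: this only fetches files, it does not run any evaluation.
    """
    # Imported lazily so this module loads even if huggingface_hub is not installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # the post links to a datasets page
        local_dir=local_dir,  # None falls back to the default HF cache
    )
```

Calling `fetch_traces()` needs network access. Since the post says the page holds only traces from past evaluations, this is mainly useful for inspection and is probably not something to wire into a CI job the way an eval set would be.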
