2h ago

Prime Intellect launches Hosted Evaluations, managing sandboxes and compute infrastructure to simplify complex AI model benchmarks

The platform supports testing models like Claude Opus 4.7.

Original post

Today, we are launching Hosted Evaluations on the platform. Running evals is an infra problem: harnesses, sandboxes, hours of compute, hundreds of parallel runs. Running evals is hard. Until now.

12:21 PM · May 30, 2026

the way to make post-training easier is to build powerful general tooling where training is an opt-in feature

evals are environments

Prime Intellect @PrimeIntellect

Today, we are launching Hosted Evaluations on the platform. Running evals is an infra problem: harnesses, sandboxes, hours of compute, hundreds of parallel runs. Running evals is hard. Until now.

7:21 PM · May 30, 2026 · 21.4K Views
9:39 PM · May 30, 2026 · 19 Views

@eliebakouch wait so the RL rollout viewer is the same as the eval rollout viewer? does this mean evals and environments are the same thing?? and people can go from evals to post-training with a single command???

elie @eliebakouch

look at how beautiful this rollouts viewer is, never been easier to create, run and look at (eval) data

9:17 PM · May 30, 2026 · 546 Views
9:33 PM · May 30, 2026 · 121 Views

look at how beautiful this rollouts viewer is, never been easier to create, run and look at (eval) data

9:17 PM · May 30, 2026 · 546 Views

deep dive by @xeophon here:

Florian Brand @xeophon

Hosted evals are finally live!! Small vid showing how to use them, more to come : )

7:23 PM · May 30, 2026 · 7.6K Views
9:18 PM · May 30, 2026 · 138 Views

it’s beautiful

7:45 PM · May 30, 2026 · 3.3K Views

@xeophon @vincentweisser Managing eval infra? Dw about it kitten.

7:38 PM · May 30, 2026 · 471 Views

@johannes_hage @dominik_scherm .@dominik_scherm is the 🐐 for solving all my weird setups

Johannes Hagemann @johannes_hage

.@xeophon & @dominik_scherm have been cooking on the smoothest experience to run your evals

read more: https://www.primeintellect.ai/blog/hosted-evaluations

8:02 PM · May 30, 2026 · 1.7K Views
8:03 PM · May 30, 2026 · 305 Views

Hosted evals are finally live!!

Small vid showing how to use them, more to come : )

7:23 PM · May 30, 2026 · 7.6K Views

.@xeophon & @dominik_scherm have been cooking on the smoothest experience to run your evals

read more: https://www.primeintellect.ai/blog/hosted-evaluations

8:02 PM · May 30, 2026 · 1.7K Views