Prime Intellect introduces General-Agent, a fully synthetic reinforcement learning environment that generates self-evolving tool-use tasks with 4,504 examples across 1,040 domains and 8,159 tools

Views 104.4K · Bookmarks 452 · Likes 818 · Replies 9

kache@yacineMTB

This is probably why gpt 5.5 is so ridiculously good at decompiling and reverse engineering steam games huh

Prime Intellect@PrimeIntellect

The next step toward automating AI is automating RL environments

Introducing General-Agent: A fully synthetic environment whose task corpus self-evolves and grows harder over time

4,504 tool-use tasks · 1,040 domains · 8,159 unique tools

42d · 104.4K views · 818 likes · 452 bookmarks

RETWEETS 117

Prime Intellect@PrimeIntellect

The next step toward automating AI is automating RL environments

Introducing General-Agent: A fully synthetic environment whose task corpus self-evolves and grows harder over time

4,504 tool-use tasks · 1,040 domains · 8,159 unique tools

42d · 277K views · 1.3K likes · 1K bookmarks

Viv@Vtrivedy10

awesome work by the PI team 👏

I think we’re still in the very early days of exciting research around generating high quality Agent Training Envs/Evals as multi-player games

Tweeted AlphaEval which we were riffing on last month, super excited to dig into this, some great ideas here + more to swarm on:

- Curriculum Learning and Tiers of Difficulty in Envs/Evals
- 2 Player setup vs >=3 player setup with a judge
- Gating + validation mechanisms
- Grounding in production traces vs not?

this type of data generation at scale shows sparks of being able to adapt agents to any domain without tons of data curation (besides good rubrics and game design)

great drop 🔥

λux@novasarc01

absolutely loving this. for the past few months i’ve also been exploring synthetic and procedural environment generation for agents (though at a much smaller and earlier-stage scale). the projects i’ve been working on (synthetic workspace gym and rlvr gym) include a small set of environment families such as python script repair, pipeline repair, tabular tasks, scheduling, graph planning and retrieval-style workspace tasks, with workspace artifact logging, hidden evaluators, trajectory traces and final diffs. the motivation was mostly curiosity...prompts alone feel too weak as a substrate for studying long-horizon agents. what we really need are executable worlds where agents can inspect state, take actions, receive grounded feedback and fail in ways that are actually analyzable.

that is why i’m especially excited by what the prime intellect team has done here. we need thousands of diverse, verifiable, stateful tasks across domains not just static prompt datasets or a handful of handcrafted benchmarks. imo scaling this lets us study curriculum, tool use, verifier design, task difficulty, trajectory quality, reward hacking and long-horizon generalization in a much more systematic way.

my own versions are still early and small but it’s really exciting to see PI push this direction seriously: evolving environments, calibrating difficulty and generating broad task distributions across thousands of domains. also i really loved the clean and neat blog post!

Prime Intellect@PrimeIntellect

Blog:

https://www.primeintellect.ai/blog/general-agent

Prime Intellect@PrimeIntellect

For more details, check out the environment and the blog post

https://app.primeintellect.ai/dashboard/environments/primeintellect/general-agent

Vincent Weisser@vincentweisser

Automating RL environments is the next step toward automating everything else.

Introducing general-agent by @mikasenghaas

> open agentic environments with 1000s of tools are scarce, so we're building one that builds itself
> A synthesizer evolves tasks across difficulty tiers, empirically gated by a solver. Hard tiers seed the next wave, hillclimbing toward frontier-level difficulty.
> 4,504 tasks / 1,040 domains / 8,159 unique tools

Florian Brand@xeophon

making the tech that closed labs have open and giving it to everyone, one release after another :)

Mika Senghaas@mikasenghaas

this was a fun side project to increase our set of usable agent tasks with a focus on tool diversity. the thing that stood out to me the most was the process: going from an idea, to environment, to running 1000s of parallel, multi-hour agent episodes was mind-bogglingly easy thanks to our stack. i view the v0 taskset as a preview of what’s to come: byo harness with broad task composability + truly multi-agent training to create, evolve and verify tasks on-the-fly

Johannes Hagemann@johannes_hage

go from idea, to environment, to running 1000s of parallel, multi-hour agent episodes within a day.

open superintelligence stack

Prime Intellect@PrimeIntellect

This work was built entirely on top of our stack

- verifiers: to build the solver and synthesizer
- hosted evals: to synthesize tasks at massive scale
- hosted training: to validate training behavior

allowing us to go from prototype to thousands of agents running in parallel within a day.

srija@srijatwt

at this point i can wake up every morning and expect to see @PrimeIntellect cook another very cool thing

Seth Karten@sethkarten

This is the right attitude towards agents. Continually learning the harness+model

Prime Intellect@PrimeIntellect

This work is a step towards self-improving agents. We believe the environment has many of the right ingredients to evolve our tooling and platform towards:

- training agents, not models (train any task in any harness)
- compose multiple agents (multi-agent episodes like synthesizer-solver, solver-grader, etc.)

Eric W. Tramel@fujikanaeda

so cool -- the space of possible informative environments is vast, and manually enumerating them is intractable. automating diverse construction is critical for making Bouba (and not Kiki) post-trained model behavior & capabilities.

Vincent Weisser@vincentweisser

@mikasenghaas https://www.primeintellect.ai/blog/general-agent

Johannes Hagemann@johannes_hage

highly recommend checking out @mikasenghaas's full blog post on the general-agent environment release with all the details and experiments:

https://www.primeintellect.ai/blog/general-agent

Vincent Weisser@vincentweisser

@PrimeIntellect @mikasenghaas cooked 🐐🐐

Prime Intellect@PrimeIntellect

Most environments today are static snapshots. The general-agent environment ships two agents capable of synthesizing tasks on-the-fly in a 2-player loop:

A synthesizer agent evolves a task in difficulty tiers. Each tier's difficulty is measured by running a solver agent against it. Only tiers that land in the target pass-rate band are kept; the hardest tiers are used to seed the next wave.
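The gating step described above can be sketched roughly as follows. This is a minimal illustration, not the actual General-Agent implementation: the `Tier` record, the function names, the band of 0.2 to 0.8, and the reading of "hardest" as "lowest pass rate among kept tiers" are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Tier:
    task_id: str
    level: int        # difficulty tier index (hypothetical field)
    pass_rate: float  # fraction of solver rollouts that passed


def measure_pass_rate(solve: Callable[[str], bool], task: str, n_rollouts: int) -> float:
    """Run the solver n_rollouts times on one task and return the fraction solved."""
    if n_rollouts <= 0:
        raise ValueError("n_rollouts must be positive")
    return sum(1 for _ in range(n_rollouts) if solve(task)) / n_rollouts


def gate_tiers(tiers: List[Tier], low: float, high: float) -> List[Tier]:
    """Keep only tiers whose pass rate lands inside the target band [low, high]."""
    return [t for t in tiers if low <= t.pass_rate <= high]


def pick_seeds(kept: List[Tier], k: int) -> List[Tier]:
    """Pick the k hardest kept tiers (lowest pass rate) to seed the next wave."""
    return sorted(kept, key=lambda t: t.pass_rate)[:k]


# Toy example with made-up pass rates and an assumed band of 0.2 to 0.8.
tiers = [
    Tier("task-a", 0, 0.95),  # too easy, dropped
    Tier("task-a", 1, 0.60),
    Tier("task-a", 2, 0.30),
    Tier("task-a", 3, 0.00),  # never solved, dropped
]
kept = gate_tiers(tiers, low=0.2, high=0.8)
seeds = pick_seeds(kept, k=1)
```

In this toy run, tiers 1 and 2 survive the gate and tier 2 would be handed back to the synthesizer as the seed for the next wave.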

Prime Intellect@PrimeIntellect

We let GLM-5.1 and GPT-5-Mini play this game for 2 days to create the initial task corpus of this environment. We then analyzed the task corpus by generating over 200K solver traces from GLM-5.1. We find that solve rate decreases predictably with increasing difficulty tiers as a result of more complex queries that require reasoning over more tools and larger databases.
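As a rough illustration of the kind of analysis described, solver traces can be grouped by tier to get a per-tier solve rate. The trace format (a `(tier, solved)` pair), the function names, and the toy data below are assumptions for the sketch, not the format or results of the actual corpus.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def solve_rate_by_tier(traces: Iterable[Tuple[int, bool]]) -> Dict[int, float]:
    """Aggregate (tier, solved) pairs into a solve rate per difficulty tier."""
    totals: Dict[int, int] = defaultdict(int)
    wins: Dict[int, int] = defaultdict(int)
    for tier, solved in traces:
        totals[tier] += 1
        wins[tier] += int(solved)
    return {tier: wins[tier] / totals[tier] for tier in sorted(totals)}


def is_non_increasing(rates: Dict[int, float]) -> bool:
    """Check that the solve rate never goes up as the tier index increases."""
    values = [rates[tier] for tier in sorted(rates)]
    return all(a >= b for a, b in zip(values, values[1:]))


# Made-up traces for illustration only.
traces = [(0, True), (0, True), (1, True), (1, False), (2, False), (2, False)]
rates = solve_rate_by_tier(traces)
monotone = is_non_increasing(rates)
```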

stochasm@stochasticchasm

@vincentweisser @PrimeIntellect @mikasenghaas insane
