PrimeIntellect introduces Renderers boosting RL throughput over 3x

QUOTE POST

Glad to see this -- renderers are a foundational component of the LLM stack. Renderers map between tokens and messages, which are invariant to tokenizer and formatting details. Most APIs, datasets, and RL environments are defined in terms of messages. Getting the details wrong leads to train-test mismatches, caching inefficiencies, and prompt injection vulnerabilities. We included a renderers module in Tinker Cookbook, but it makes sense as a standalone library.

Prime Intellect@PrimeIntellect

Introducing Renderers RL trainers work in tokens. Environments work in messages. Going back and forth corrupts sampled tokens, wasting compute on every agentic turn. With Renderers, we fix this mismatch. This unlocks >3x throughput on popular open models.

11:44 PM · May 12, 2026 · 169K Views

4:07 PM · May 29, 2026 · 15.6K Views

QUOTE POST

#64Nathan Lambert@NATOLAMBERT

The jinja chat template has always felt like a temporary equilibrium, so we've needed someone to take the reigns and try to build that out within the community.

Excited about this!

Prime Intellect@PrimeIntellect

Introducing Renderers RL trainers work in tokens. Environments work in messages. Going back and forth corrupts sampled tokens, wasting compute on every agentic turn. With Renderers, we fix this mismatch. This unlocks >3x throughput on popular open models.

11:44 PM · May 12, 2026 · 169K Views

11:59 PM · May 12, 2026 · 14K Views

REPLY

#64Nathan Lambert@NATOLAMBERT

Harmony was the first attempt at this imo, but it never broke out of the OpenAI model ecosystem. I'm honestly not sure why, but would guess lack of community effort https://github.com/openai/harmony

Nathan Lambert@natolambert

The jinja chat template has always felt like a temporary equilibrium, so we've needed someone to take the reigns and try to build that out within the community. Excited about this!

11:59 PM · May 12, 2026 · 14K Views

12:01 AM · May 13, 2026 · 2.5K Views

REPLY

#64Nathan Lambert@NATOLAMBERT

@willccbb @vllm_project @sgl_project @huggingface @tinkerapi confirmado

will brown@willccbb

all chat templates are wrong, some chat templates are useful we found some CRAZY performance wins by patching official templates, and we're shipping them in a standalone library you can use with any RL stack w/ examples for @vllm_project @sgl_project @huggingface @tinkerapi

11:50 PM · May 12, 2026 · 40.1K Views

12:00 AM · May 13, 2026 · 1.3K Views

REPLY

#64Nathan Lambert@NATOLAMBERT

@willccbb @vllm_project @sgl_project @huggingface @tinkerapi src https://rlhfbook.com/teach/course/lec2-chap4-5-9/#14

Nathan Lambert@natolambert

@willccbb @vllm_project @sgl_project @huggingface @tinkerapi confirmado

12:00 AM · May 13, 2026 · 1.3K Views

12:00 AM · May 13, 2026 · 514 Views

QUOTE POST

#120Taco Cohen@TACOCOHEN

A gift from the Gods. Dealing with multiple models and many envs in the same RL codebase while respecting correctness constraints (no train / inference tokenization mismatch) is becoming a huge pain.

I have a vibe-coded draft PR that does exactly this, but happy I won’t have to land or maintain it now. Let’s hope the field can really standardize on one abstraction.

Prime Intellect@PrimeIntellect

Introducing Renderers RL trainers work in tokens. Environments work in messages. Going back and forth corrupts sampled tokens, wasting compute on every agentic turn. With Renderers, we fix this mismatch. This unlocks >3x throughput on popular open models.

11:44 PM · May 12, 2026 · 169K Views

9:47 AM · May 13, 2026 · 14.6K Views

REPLY

#228Andreas Kirsch 🇺🇦@BLACKHC

@TacoCohen Very cool! I think tinker from @thinkymachines had that API as well

Taco Cohen@TacoCohen

A gift from the Gods. Dealing with multiple models and many envs in the same RL codebase while respecting correctness constraints (no train / inference tokenization mismatch) is becoming a huge pain. I have a vibe-coded draft PR that does exactly this, but happy I won’t have to land or maintain it now. Let’s hope the field can really standardize on one abstraction.

9:47 AM · May 13, 2026 · 14.6K Views

10:55 AM · May 13, 2026 · 363 Views

REPLY

#228Andreas Kirsch 🇺🇦@BLACKHC

@TacoCohen @hallerite @thinkymachines https://github.com/thinking-machines-lab/tinker-cookbook/tree/main/tinker_cookbook/renderers

http://base.py has the ABCs

11:33 AM · May 13, 2026 · 57 Views

POST

#339will brown@WILLCCBB

some of our fav bugs on the road to `renderers`

read all about it: https://www.primeintellect.ai/blog/renderers

6:54 AM · May 13, 2026 · 4.8K Views

REPLY

#339will brown@WILLCCBB

go render some tokens:

github.com

GitHub - PrimeIntellect-ai/renderers

Contribute to PrimeIntellect-ai/renderers development by creating an account on GitHub.

will brown@willccbb

some of our fav bugs on the road to `renderers` read all about it: https://www.primeintellect.ai/blog/renderers

6:54 AM · May 13, 2026 · 4.8K Views

7:04 AM · May 13, 2026 · 1.1K Views

QUOTE POST

#339will brown@WILLCCBB

all chat templates are wrong, some chat templates are useful

we found some CRAZY performance wins by patching official templates, and we're shipping them in a standalone library you can use with any RL stack

w/ examples for @vllm_project @sgl_project @huggingface @tinkerapi

Prime Intellect@PrimeIntellect

Introducing Renderers RL trainers work in tokens. Environments work in messages. Going back and forth corrupts sampled tokens, wasting compute on every agentic turn. With Renderers, we fix this mismatch. This unlocks >3x throughput on popular open models.

11:44 PM · May 12, 2026 · 169K Views

11:50 PM · May 12, 2026 · 40.1K Views

REPLY

#339will brown@WILLCCBB

the core of the issue is that both encoding and parsing are many-to-one

vanilla TITO does prefix lookup in token-space, which misses many rendering collisions

the solution is to do lookup in message-space, then input prep in token-space, which we call bridge_to_next_turn

will brown@willccbb

all chat templates are wrong, some chat templates are useful we found some CRAZY performance wins by patching official templates, and we're shipping them in a standalone library you can use with any RL stack w/ examples for @vllm_project @sgl_project @huggingface @tinkerapi

11:50 PM · May 12, 2026 · 40.1K Views

11:57 PM · May 12, 2026 · 2K Views

REPLY

#339will brown@WILLCCBB

@vllm_project @sgl_project @huggingface @tinkerapi we're intending for this to become a programmable source of truth for template implementations so that we can finally get rid of jinja

lots here already, but PRs welcome for all models!

will brown@willccbb

the core of the issue is that both encoding and parsing are many-to-one vanilla TITO does prefix lookup in token-space, which misses many rendering collisions the solution is to do lookup in message-space, then input prep in token-space, which we call bridge_to_next_turn

11:57 PM · May 12, 2026 · 2K Views

12:10 AM · May 13, 2026 · 1.7K Views

QUOTE POST

#339will brown@WILLCCBB

@vllm_project @sgl_project @huggingface @tinkerapi from a live run:

12:21 AM · May 13, 2026 · 1.4K Views

QUOTE POST

#707Vincent Weisser@VINCENTWEISSER

We are open sourcing renderers

For RL, the inference server should be simple Tokens in, tokens out

renderers is the token-level chat templating layer to >render messages to tokens >parse completions to structure >bridge rollouts byte-for-byte > >3x throughput on openmodels

Prime Intellect@PrimeIntellect

Introducing Renderers RL trainers work in tokens. Environments work in messages. Going back and forth corrupts sampled tokens, wasting compute on every agentic turn. With Renderers, we fix this mismatch. This unlocks >3x throughput on popular open models.

11:44 PM · May 12, 2026 · 169K Views

12:03 AM · May 13, 2026 · 9.1K Views

QUOTE POST

#1153Florian Brand@XEOPHON

working at prime is just "ugh i had this gnarly problem, let’s fix it and then make it available to everyone"

a ton of other things are coming, can’t wait to show it to yall :)

Prime Intellect@PrimeIntellect

Introducing Renderers RL trainers work in tokens. Environments work in messages. Going back and forth corrupts sampled tokens, wasting compute on every agentic turn. With Renderers, we fix this mismatch. This unlocks >3x throughput on popular open models.

11:44 PM · May 12, 2026 · 169K Views

6:40 AM · May 13, 2026 · 4.5K Views

QUOTE POST

#1186Johannes Hagemann@JOHANNES_HAGE

never again

Prime Intellect@PrimeIntellect

Introducing Renderers RL trainers work in tokens. Environments work in messages. Going back and forth corrupts sampled tokens, wasting compute on every agentic turn. With Renderers, we fix this mismatch. This unlocks >3x throughput on popular open models.

11:44 PM · May 12, 2026 · 169K Views

11:50 PM · May 12, 2026 · 6K Views

PrimeIntellect introduces Renderers boosting RL throughput over 3x

Sentiment

Cluster engagement