
Sapient Intelligence releases HRM-Text, a 1B-parameter reasoning language model trained on 40B structured tokens that reaches competitive benchmark performance with roughly 1/1000th the data volume of comparable systems

Full training completes in one day on a $1,000 budget.


Original post

Alexander Doria@Dorialexander

Very happy to see SYNTH continuing to power innovative model research.

Sapient Intelligence@Sapient_Int

Introducing HRM-Text.

An ultra-lean 1B-parameter reasoning language model designed to deliver strong general performance with a fraction of the data, compute, and infrastructure.

Trained on just 40B structured tokens, HRM-Text achieves competitive performance while using ~1/1000 of the training data of comparable models.

The kicker? The full model trains in roughly one day on a $1,000 budget.

This opens the door to a new generation of AI that is powerful, accessible, and radically easier to adapt. Theories and research concepts once deemed too expensive to test are officially back in the game.

Sapient Intelligence invites you to help us shape a new paradigm for general intelligence.

4:46 PM · May 18, 2026 · 12.5K Views

Sentiment

Some users praised Sapient Intelligence's HRM-Text model for its efficiency and reasoning, while others questioned the benchmark results because benchmark training sets are present, and often upsampled, in the training data.

Positive: 98.6%

Negative: 1.4%

119 comments with sentiment.


Posts from X

Most Activity

Views: 70.5K · Bookmarks: 577 · Likes: 703 · Replies: 21

Guan Wang@makingAGI

The HRM-Text paper is now available 🎉

HRM-Text explores a different approach to language model pretraining: hierarchical recurrent computation, task-completion training, and latent-space reasoning.

At just 1B parameters, HRM-Text achieves competitive performance with dramatically lower training cost and data requirements.

1B parameters · 40B unique tokens · ~1 day of pretraining · ~$1000 training cost
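For a sense of scale, the quoted headline figures can be turned into a few derived ratios. This is a back-of-the-envelope sketch only: the inputs are the approximate numbers from the posts, the function name is hypothetical, and the "implied comparable corpus" simply multiplies 40B by the claimed ~1000× factor rather than citing any specific model.

```python
# Back-of-the-envelope ratios from the figures quoted in the posts.
# All inputs are the approximate ("~") values stated there, not measurements.

def headline_ratios(params, tokens, cost_usd, data_ratio):
    """Return derived ratios for a training run (hypothetical helper)."""
    return {
        # Unique training tokens seen per model parameter.
        "tokens_per_param": tokens / params,
        # Corpus size implied for "comparable models" by the ~1/1000 claim.
        "implied_comparable_tokens": tokens * data_ratio,
        # Training cost per billion unique tokens.
        "usd_per_billion_tokens": cost_usd / (tokens / 1_000_000_000),
    }

ratios = headline_ratios(
    params=1_000_000_000,    # 1B parameters
    tokens=40_000_000_000,   # 40B unique tokens
    cost_usd=1000,           # ~$1000 training cost
    data_ratio=1000,         # "~1/1000 of the training data"
)
print(ratios)
```

Under these inputs the run works out to 40 unique tokens per parameter, about $25 per billion unique tokens, and an implied comparison corpus of roughly 40 trillion tokens.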


Retweets: 327

Sapient Intelligence@Sapient_Int

Introducing HRM-Text.


Sapient Intelligence@Sapient_Int

Download HRM-Text 🔗

GitHub: https://github.com/sapientinc/HRM-Text

Hugging Face: https://huggingface.co/sapientinc/HRM-Text-1B


Alexia Jolicoeur-Martineau@jm_alexia

Amazing work by @Sapient_Int showing the massive potential of smaller recursive models.

At the smallest scale (0.6B) their TRM variant achieves the best scores on downstream tasks and beats Transformers trained at the 3B scale.

Guan Wang@makingAGI

The HRM-Text paper is now available 🎉


return of the research era ꙮ@byebyescaling

DEPTH-IN-TIME BEATS DEPTH-IN-PARAMS

Most of the alpha here is the H/L recurrence: spending FLOPs as unrolled cycles instead of layers, with truncated BPTT annealed from 2→5 steps.

Btw also the supporting stack matters almost as much:

1. Two-pass FA3 kernel doing PrefixLM as bidirectional-prefix + causal-response

2. LPT bin-packed batches balancing O(n²) attention work across ranks, not tokens

3. Instruction-masked PrefixLM loss baked into pretraining, not bolted on as SFT
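The three stack items above can be illustrated with a small, dependency-free sketch. It is not the HRM-Text implementation: the dense boolean mask stands in for what the post describes as a two-pass FA3 kernel, "LPT" is assumed here to mean the classic longest-processing-time-first greedy heuristic, and every function name is hypothetical.

```python
import heapq

def prefix_lm_mask(prefix_len, total_len):
    """mask[i][j] is True when query position i may attend to key position j.

    Prefix positions see the whole prefix (bidirectional); response
    positions see the prefix plus earlier response tokens (causal).
    """
    return [[j < prefix_len or j <= i for j in range(total_len)]
            for i in range(total_len)]

def response_loss_mask(prefix_len, total_len):
    """True only for response positions, so instruction tokens add no loss."""
    return [i >= prefix_len for i in range(total_len)]

def lpt_pack(seq_lens, n_ranks):
    """Greedy LPT: place the costliest sequence on the least-loaded rank.

    Cost is modeled as n**2 to reflect quadratic attention work,
    rather than balancing raw token counts.
    """
    heap = [(0, rank) for rank in range(n_ranks)]  # (load, rank)
    heapq.heapify(heap)
    bins = [[] for _ in range(n_ranks)]
    for n in sorted(seq_lens, reverse=True):
        load, rank = heapq.heappop(heap)
        bins[rank].append(n)
        heapq.heappush(heap, (load + n * n, rank))
    loads = [sum(n * n for n in b) for b in bins]
    return bins, loads

mask = prefix_lm_mask(prefix_len=2, total_len=4)
loss_mask = response_loss_mask(prefix_len=2, total_len=4)
bins, loads = lpt_pack([8, 7, 6, 5, 4], n_ranks=2)
print(mask, loss_mask, bins, loads)
```

With lengths 8, 7, 6, 5, 4 on two ranks, the packer produces quadratic loads of 89 and 101 while the token counts are 13 and 17, which shows why balancing attention work and balancing tokens can give different assignments.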

Sapient Intelligence@Sapient_Int

Introducing HRM-Text.


Alexander Doria@Dorialexander

New HRM-Text paper featuring SYNTH as leading training source.

Guan Wang@makingAGI

The HRM-Text paper is now available 🎉


Yacine Mahdid@yacinelearning

@Sapient_Int


Guan Wang@makingAGI

Paper: https://sapientinc.github.io/HRM-Text/assets/HRM_Text.pdf

GitHub: https://github.com/sapientinc/HRM-Text

Hugging Face: https://huggingface.co/sapientinc/HRM-Text-1B


Yasin Abbasi Yadkori@Yadkori

Glad to have contributed to this project

Sapient Intelligence@Sapient_Int

Introducing HRM-Text.


Benhao Huang@huskydogewoof

@makingAGI Nice work! Enjoyed the read! HRM-Text is real signal — but the headline hides 3 tricks. After reading the ablations, my take is more nuanced.


Grok@grok

HRM-Text wins where efficiency, cost, and deployment matter most.

Real-world edges over SOTA:

- On-device / offline AI (phones, IoT, embedded hardware): full privacy, no cloud bills or latency
- Rapid custom fine-tuning for niche domains (small biz tools, personal agents, research prototypes) on a $1k budget in a day
- Real-time systems needing low latency (robotics, interactive apps, edge analytics)
- Low-resource environments where big models are too expensive or slow to run at scale

It brings capable reasoning to places massive models simply can't go economically.


Alexander Doria@Dorialexander

as it happens, talking about HRM on @MTSlive in a few minutes.

Alexander Doria@Dorialexander

New HRM-Text paper featuring SYNTH as leading training source.


aditya@adiaddxyz

@Sapient_Int @grok what’re some real world use cases for HRM-Text that current SOTA models lose in?


Raúl López@sandandcode

@makingAGI @Sapient_Int If it’s so cheap to train, why did you train just a 1B model and not an 8B or 32B one? Curious how this generalizes to bigger models.



AI_Explorer@ai_explorer25

@Sapient_Int Watched it twice. The $1k pretraining number is going to do more for independent research than any open-weights release this year


Alexander Doria@Dorialexander

from their documented data pipelines https://github.com/sapientinc/data_io/tree/main (seems to be the leading source along with OpenThoughts and OpenMathInstruct-2)

Alexander Doria@Dorialexander

Very happy to see SYNTH continuing to power innovative model research.


Adam Ritter@ritteradam

@Sapient_Int Amazing result, this is where I don't understand why you didn't start a company instead of letting the LLM companies just use it to train much smarter models.


Flavius Burca@flaviusburca

@Sapient_Int Am I missing something? HRM-Text is trained on a corpus where every benchmark's training set is present and often upsampled 10×. Of course you hit the same benchmark score with 150× less data when 150× more of your data is on-distribution for the eval.


umumu@umi33563

@Dorialexander Unrelated, but. At 46h run, it's sane not to give a fuck
