Sapient Intelligence releases HRM-Text, a 1B-parameter reasoning model trained on 40B structured tokens that reaches competitive performance with one-thousandth the data volume of comparable systems.
Full training finished in one day for under $1,000.
Very happy to see SYNTH continuing to power innovative model research.

From their documented data pipelines: https://github.com/sapientinc/data_io/tree/main (SYNTH seems to be the leading source, along with OpenThoughts and OpenMathInstruct-2).
SCALING ISN’T EVERYTHING
Another tiny model breaking the rule:
- trained on less than 1/1000th of the data
- can be trained in a single day for under 1,000 USD
The human knowledge base can be compressed and retrieved much more tightly than LLMs do today.