Surya OCR 2 Launches With 650M Parameters and Top OCR Benchmarks

Original post

Victor Sanh#538

Vik Paruchuri@VikParuchuri

Announcing Surya OCR 2:

- 650M params
- 83.3% olmocr bench score (top under 3B)
- 87% on internal 91-lang benchmark
- 5 pages/s on RTX 5090
- Runs on CPU, GPU, MPS

9:39 AM · May 27, 2026 · 23.5K Views

Vik Paruchuri@VikParuchuri

Get started with `pip install surya-ocr` and `surya_ocr file.pdf`. Needs llama.cpp (CPU) or Docker (GPU).

More info:
- Model - https://huggingface.co/datalab-to/surya-ocr-2
- Github - https://github.com/datalab-to/surya
- Blog post - https://www.datalab.to/blog/surya-2
- Playground - https://www.datalab.to/playground
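For anyone scripting the quick-start, here is a minimal Python sketch that wraps the `surya_ocr file.pdf` invocation quoted in the post. The helper names `build_command` and `run_ocr` are hypothetical, and no CLI flags are assumed beyond the bare command shown above. Check the README for the options that actually exist.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(pdf_path: str) -> list[str]:
    """Build the argv for the CLI call quoted in the post: `surya_ocr file.pdf`."""
    # Hypothetical helper: passes only the input path, with no extra flags assumed.
    return ["surya_ocr", str(Path(pdf_path))]


def run_ocr(pdf_path: str) -> int:
    """Run the CLI if it is on PATH and return its exit code."""
    if shutil.which("surya_ocr") is None:
        raise FileNotFoundError(
            "surya_ocr not found on PATH; install it with `pip install surya-ocr`"
        )
    # check=False so the caller can decide how to handle a non-zero exit code.
    return subprocess.run(build_command(pdf_path), check=False).returncode
```

Calling `run_ocr("file.pdf")` only checks that the executable exists. It does not verify the llama.cpp or Docker requirement mentioned in the post.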

Vik Paruchuri@VikParuchuri

We see 5 pages/s on an RTX 5090 (128 concurrency) and 0.1 pages/s on an M1 MacBook. There are a few performance levers you can tune (see the README).
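To put those throughput figures in perspective, here is a rough back-of-the-envelope sketch in Python. The function name `estimated_seconds` and the constant names are illustrative only. The two rates are the ones reported above, and real throughput will vary with concurrency, page complexity, and tuning.

```python
# Reported throughput figures from the post (pages per second).
RTX_5090_PAGES_PER_S = 5.0   # measured at 128 concurrency
M1_MACBOOK_PAGES_PER_S = 0.1


def estimated_seconds(pages: int, pages_per_second: float) -> float:
    """Estimate wall-clock seconds to process `pages` at a steady rate."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    if pages_per_second <= 0:
        raise ValueError("pages_per_second must be positive")
    return pages / pages_per_second


if __name__ == "__main__":
    for name, rate in [("RTX 5090", RTX_5090_PAGES_PER_S),
                       ("M1 MacBook", M1_MACBOOK_PAGES_PER_S)]:
        secs = estimated_seconds(1000, rate)
        print(f"{name}: ~{secs / 60:.1f} minutes for 1000 pages")
```

At these rates, a 1,000-page job would take a few minutes on the RTX 5090 and a few hours on the M1, assuming the reported throughput holds for the whole run.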

Vik Paruchuri@VikParuchuri

Surya 2 improves accuracy significantly across tables, handwriting, forms, math, and layout. Here are a few examples.

Vik Paruchuri@VikParuchuri

Here are results across a few top languages. You can see the full multilingual results here: https://github.com/datalab-to/surya/blob/master/static/docs/multilingual.md

Vik Paruchuri@VikParuchuri

Surya still makes small single-character mistakes on some languages, especially with handwriting - we're actively working on this.

And now that surya is updated, expect an update to marker soon.
