Tech · 4h ago

Baidu open-sources Unlimited-OCR, a 3B-parameter model that parses 40-page PDFs with a constant KV cache

The model achieved state-of-the-art performance on OmniDocBench.


#33

Original post

Susan Zhang @suchenzang · #84 in Tech

this is what open-source looks like

Baidu Inc. @Baidu_Inc

3B total parameters & 500M activated, yet powerful enough to transcribe 40+ pages in one pass while keeping context intact. Meet Unlimited OCR!

8:40 AM · Jun 23, 2026 · 38.6K Views

Sentiment

Users are excited about Baidu's Unlimited-OCR release because its compact 3B size enables affordable self-hosting instead of costly API calls.

Positive: 100.0%

Negative: 0.0%

3 comments with sentiment.


Related links

baidu/Unlimited-OCR · Hugging Face


Posts from X

Most Activity

Views 6.2K · Bookmarks 38 · Likes 61 · Retweets 6 · Replies 7

AK @_akhaliq

Baidu just released Unlimited-OCR

1h · 6.2K views · 61 likes · 38 bookmarks

Susan Zhang @suchenzang

this is what enterprise-saas-maxxing looks like

Niels Rogge @NielsRogge

Mistral claims SOTA performance on OlmOCRBench, a popular optical character recognition benchmark, but that isn't the case.

We have a public leaderboard on @huggingface, where Mistral OCR 4 currently ranks #3, behind open models like Chandra OCR 2 by @datalabto

4h · 2.9K · 344

AK @_akhaliq

https://huggingface.co/baidu/Unlimited-OCR

AK @_akhaliq

Baidu just released Unlimited-OCR

1h · 1.7K · 66

veloX @veloXxn

@_akhaliq This is the real advantage of Unlimited OCR working with a language model: it can bulk-correct the piles of misrecognized documents sitting in libraries and archives, bringing the error rate far below the 30-40% of earlier OCR engines.

1h · 69