LlamaIndex Launches LiteParse v2.1, the Fastest Open-Source PDF-to-Markdown Parser

Original post

It's kind of crazy how well LiteParse does on markdown document parsing even compared against frontier VLMs - when it doesn't use VLMs or any AI/OCR models at all. It's pure code.

On ParseBench, it outperforms Qwen 3.5-9B / GLM-OCR.

There's still a gap vs. models like Gemma 4 and PaddleOCR-VL especially on dense visual outputs, but if your documents are text/table-heavy this gap closes rapidly.

Come check it out: it's the fastest document parser you can possibly use, and it's completely free/open-source.

Repo: https://github.com/run-llama/liteparse

Jerry Liu (@jerryjliu0)

We built the fastest PDF -> markdown parser in the world 🚀⚡️

AND it’s more accurate than any other open-source, model-free parser (pymupdf4llm, opendataloader, pdf-inspector, markitdown) on 3 standardized benchmarks: olmOCR-bench, opendataloader-bench, ParseBench

Introducing LiteParse v2.1. The v2 base version was already the fastest document->text parser on the planet, and with this new release we’ve introduced markdown.

It is fully open-source (Apache 2.0) and free, is usable from CLI/Rust/Node/Python/WASM, and is also installable as a one-click agent skill.

Check it out: https://www.llamaindex.ai/blog/markdown-comes-to-liteparse

Come check out LiteParse: https://github.com/run-llama/liteparse

9:18 AM · Jun 19, 2026 · 3.8K Views