It's kind of crazy how well LiteParse does on markdown document parsing, even compared against frontier VLMs, given that it doesn't use VLMs or any AI/OCR models at all. It's pure code.
On ParseBench, it outperforms Qwen 3.5-9B / GLM-OCR.
There's still a gap vs. models like Gemma 4 and PaddleOCR-VL, especially on dense visual outputs, but if your documents are text/table-heavy this gap closes rapidly.
Come check it out: it's the fastest document parser you can possibly use, and it's completely free/open-source.
Repo: https://github.com/run-llama/liteparse
We built the fastest PDF -> markdown parser in the world 🚀⚡️
AND it’s more accurate than any other open-source, model-free parser (pymupdf4llm, opendataloader, pdf-inspector, markitdown) on 3 standardized benchmarks: olmOCR-bench, opendataloader-bench, ParseBench
Introducing LiteParse v2.1. The v2 base version was already the fastest document -> text parser on the planet, and with this new release we’ve added markdown output.
It is fully open-source (Apache 2.0) and free, is usable from CLI/Rust/Node/Python/WASM, and is also installable as a one-click agent skill.
Check it out: https://www.llamaindex.ai/blog/markdown-comes-to-liteparse
Come check out LiteParse: https://github.com/run-llama/liteparse
