We parsed this SpaceX equity research PDF faster than the time it took for Screen Studio to zoom in ⚡️🔥
LiteParse is now the best open-source document parsing tool out there. There’s no reason not to use it as a first pass, even if you have docs that require heavier VLM processing downstream.
Try it out now over any document: https://www.llamaindex.ai/liteparse-demo
Repo: https://github.com/run-llama/liteparse
We built the fastest PDF -> markdown parser in the world 🚀⚡️
AND it’s more accurate than any other open-source, model-free parser (pymupdf4llm, opendataloader, pdf-inspector, markitdown) on 3 standardized benchmarks: olmOCR-bench, opendataloader-bench, ParseBench
Introducing LiteParse v2.1. The v2 base version was already the fastest document -> text parser on the planet, and with this new release we’ve added markdown output.
It is fully open-source (Apache 2.0) and free, usable from CLI/Rust/Node/Python/WASM, and installable as a one-click agent skill.
Check it out: https://www.llamaindex.ai/blog/markdown-comes-to-liteparse
Come check out LiteParse: https://github.com/run-llama/liteparse