Docling
New
Feed PDFs, Word files, and images into your AI pipeline with layout and tables intact. IBM-built parsing that doesn't mangle your documents.
Developer Tools
★ 4.5 (2,900 reviews), free
Overview
Docling is an open-source library by IBM that parses PDFs, Office files, and images into structured, AI-ready formats, preserving layout, tables, and reading order for high-quality RAG ingestion.
Key Features
- PDF and Office parsing
- Table and layout extraction
- AI-ready structured output
- Integrates with LangChain/LlamaIndex
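As a rough illustration of the features above, here is a minimal Python sketch of feeding a parsed document into a RAG pipeline. It assumes the `docling` package is installed and uses its `DocumentConverter` class and `export_to_markdown()` method as shown in the project's quickstart. The `chunk_markdown` helper and the `report.pdf` path are hypothetical, added only for illustration, and are not part of Docling.

```python
from typing import List


def convert_to_markdown(source: str) -> str:
    """Parse a document (local path or URL) with Docling and return Markdown."""
    # Imported lazily so the chunking helper below can be used without Docling.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)
    return result.document.export_to_markdown()


def chunk_markdown(markdown: str, max_chars: int = 1000) -> List[str]:
    """Hypothetical helper (not part of Docling): pack blank-line-separated
    blocks into chunks of at most max_chars characters.

    A single block longer than max_chars is kept whole rather than split,
    so a large table is never cut in the middle.
    """
    chunks: List[str] = []
    current = ""
    for block in markdown.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        candidate = f"{current}\n\n{block}" if current else block
        if current and len(candidate) > max_chars:
            # Adding this block would overflow: close the current chunk.
            chunks.append(current)
            current = block
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Example usage ("report.pdf" is a placeholder path):
#   text = convert_to_markdown("report.pdf")
#   for chunk in chunk_markdown(text, max_chars=1500):
#       ...  # embed and index each chunk in a vector store of your choice
```

Splitting on blank lines keeps each Markdown table inside a single chunk, which is one simple way to benefit from the preserved table structure. Production pipelines would more likely rely on the LangChain or LlamaIndex integrations.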
Pros
- Excellent table handling
- Open source and free
- Great for RAG ingestion
Cons
- CPU-intensive on large docs
- Library, not a UI