Light estimate. Last updated 2026-06-22

OCR API for large documents

For large documents, AWS Textract handles the biggest asynchronous jobs, up to about 3000 pages (500MB) per call, ahead of Azure (around 2000 pages) and Google (around 200 pages). Mistral caps lower than AWS and Azure (around 1000 pages, 50MB) but well above Google. These are documented async limits, not throughput or accuracy measures, and structured extraction tiers carry separate costs. This is a light estimate based on provider documentation as of June 2026.
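To make the page limits concrete, the sketch below works out how many async calls a document would need if it were split at each provider's approximate page cap. The limits are the rounded figures quoted above; the function names and the short provider keys are hypothetical, and the file-size caps (500MB, 50MB) are deliberately ignored, so treat the result as a lower bound rather than a guarantee.

```python
import math

# Approximate async page limits per call, as quoted in the paragraph above.
ASYNC_PAGE_LIMITS = {
    "AWS Textract": 3000,
    "Azure": 2000,
    "Mistral": 1000,
    "Google": 200,
}


def calls_needed(total_pages: int, provider: str) -> int:
    """Number of async calls needed when splitting at the provider's page cap."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return math.ceil(total_pages / ASYNC_PAGE_LIMITS[provider])


def page_ranges(total_pages: int, provider: str) -> list[tuple[int, int]]:
    """1-based inclusive (first_page, last_page) ranges, one per call."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    limit = ASYNC_PAGE_LIMITS[provider]
    return [
        (start, min(start + limit - 1, total_pages))
        for start in range(1, total_pages + 1, limit)
    ]


if __name__ == "__main__":
    for name in ASYNC_PAGE_LIMITS:
        print(name, calls_needed(5000, name))
```

Under these assumptions a 5000-page document fits in two Textract calls but needs 25 Google calls, which is the practical meaning of the ranking above.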

Default: AWS Textract (DetectDocumentText)
Largest async size (max_grootte_async): AWS Textract (DetectDocumentText)
Provider offerings compared on Price, Score, Max pages and capabilities
| Offering | Provider | Price ($/1000 pages) | Score | Max pages | Capabilities |
|---|---|---|---|---|---|
| Mistral OCR 3 | Mistral AI | 2 | 79.75 | 1000 | tables, handwriting, JSON |
| Google Document AI (Enterprise Document OCR) | Google Cloud | 1.5 | - | 200 | tables, handwriting, JSON |
| AWS Textract (DetectDocumentText) | Amazon Web Services | 1.5 | - | 3000 | tables, handwriting, JSON |
| Azure Document Intelligence (Read) | Microsoft Azure | 1.5 | - | 2000 | tables, handwriting, JSON |
| Reducto | Reducto | * | - | - | tables, handwriting, JSON |
| LlamaParse | LlamaIndex (LlamaCloud) | 3* | - | - | tables, handwriting, JSON |

* Token- or credit-priced: the headline figure understates the real per-unit cost, so the offering is excluded from the cheapest ranking.
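As a rough illustration of how the per-1000-page prices translate into a job cost, the sketch below multiplies page count by the listed rate and ranks the page-priced offerings. It assumes the Mistral row reads as a price of 2 with a score of 79.75, leaves out the starred token- or credit-priced offerings as the footnote does, and ignores structured extraction tiers; the function names are hypothetical.

```python
# Headline prices in $ per 1000 pages, read from the table above.
# Starred (token-/credit-priced) offerings are intentionally left out.
PRICE_PER_1000_PAGES = {
    "Mistral OCR 3": 2.0,
    "Google Document AI (Enterprise Document OCR)": 1.5,
    "AWS Textract (DetectDocumentText)": 1.5,
    "Azure Document Intelligence (Read)": 1.5,
}


def estimate_cost(pages: int, offering: str) -> float:
    """Estimated cost in dollars for OCR of `pages` pages at the headline rate."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages / 1000 * PRICE_PER_1000_PAGES[offering]


def rank_by_cost(pages: int) -> list[tuple[str, float]]:
    """Offerings sorted from cheapest to most expensive for a given page count."""
    costs = [(name, estimate_cost(pages, name)) for name in PRICE_PER_1000_PAGES]
    return sorted(costs, key=lambda item: item[1])


if __name__ == "__main__":
    for name, cost in rank_by_cost(5000):
        print(f"{name}: ${cost:.2f}")
```

Because three offerings share the same headline rate, price alone does not separate them; the page limit per call is what differs.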