Chaos and Order

https://www.youngju.dev/blog
Slowly and correctly. AI Researcher & DevOps Engineer Youngju's tech blog. GPU/CUDA, LLM, MLOps, Kubernetes AI workloads, distributed training, and data engineering.
Language: ko
Author: fjvbn2003@gmail.com (Youngju Kim)
Last updated: Fri, 15 May 2026 00:00:00 GMT

Document AI / OCR in 2026: Mistral OCR / Marker / Surya / LlamaParse / Docling / OlmoOCR Deep Dive
https://www.youngju.dev/blog/culture/2026-05-15-document-ai-ocr-2026-mistral-ocr-marker-surya-llamaparse-docling-deep-dive.en

Document AI in 2026 is no longer "extract text with Tesseract." Four families of tools now attack the same problem, pulling structure and meaning out of scanned PDFs, each with a different bet: purpose-built APIs like Mistral OCR (March 2025); open-source PDF-to-Markdown engines like Marker / Surya / Docling / OlmoOCR; pretrained document models like LayoutLMv3 and Donut; and multimodal LLMs like Pixtral 12B and Florence-2. This guide arranges 13 candidates across four stages (OCR / layout / extraction / RAG ingestion) and recommends what to pick for invoices, contracts, papers, and RAG.
Published: Fri, 15 May 2026 00:00:00 GMT
Author: fjvbn2003@gmail.com (Youngju Kim)
Tags: ocr, document-ai, pdf, mistral-ocr, marker, surya, llamaparse, docling, olmoocr, nougat, tesseract, layoutlm, donut, rag, 2026, deep-dive, english

Japanese edition: Document AI / OCR 2026: Mistral OCR / Marker / Surya / LlamaParse / Docling / OlmoOCR Complete Guide
https://www.youngju.dev/blog/culture/2026-05-15-document-ai-ocr-2026-mistral-ocr-marker-surya-llamaparse-docling-deep-dive.ja

Document AI in 2026 is no longer "pull out text with Tesseract." Dedicated APIs such as Mistral OCR (March 2025), open-source PDF-to-Markdown engines such as Marker / Surya / Docling / OlmoOCR, pretrained document models such as LayoutLMv3 and Donut, and multimodal LLMs such as Pixtral 12B and Florence-2 all tackle the same problem (extracting structure and meaning from scanned PDFs) with different approaches. This article organizes 13 candidates into four stages (OCR / layout / extraction / RAG ingestion) and covers what to choose for invoices, contracts, papers, and RAG.

Published: Fri, 15 May 2026 00:00:00 GMT
Author: fjvbn2003@gmail.com (Youngju Kim)
Tags: ocr, document-ai, pdf, mistral-ocr, marker, surya, llamaparse, docling, olmoocr, nougat, tesseract, layoutlm, donut, rag, 2026, deep-dive, 日本語

Korean edition: Document AI / OCR 2026: Mistral OCR / Marker / Surya / LlamaParse / Docling / OlmoOCR In-Depth Guide
https://www.youngju.dev/blog/culture/2026-05-15-document-ai-ocr-2026-mistral-ocr-marker-surya-llamaparse-docling-deep-dive

Document AI in 2026 is no longer "pulling text out with Tesseract." Dedicated APIs such as Mistral OCR (March 2025), open-source PDF-to-Markdown engines such as Marker / Surya / Docling / OlmoOCR, pretrained document models such as LayoutLMv3 and Donut, and multimodal LLMs such as Pixtral 12B and Florence-2 all go after the same problem (extracting structure and meaning from scanned PDFs) with different approaches. This post arranges 13 candidates into four stages (OCR / layout / extraction / RAG ingestion) and sums up what to choose for invoices, contracts, papers, and RAG.

Published: Fri, 15 May 2026 00:00:00 GMT
Author: fjvbn2003@gmail.com (Youngju Kim)
Tags: ocr, document-ai, pdf, mistral-ocr, marker, surya, llamaparse, docling, olmoocr, nougat, tesseract, layoutlm, donut, rag, 2026, deep-dive