Outclassing Frontier LLMs at Extracting Information

Nov 25, 2025 | 11:00 AM - 11:10 AM

Description

Accurately extracting information from documents has been a dream for decades. Important workflows, from automated back-office processing to enterprise RAG, depend on it. LLMs promise to fulfill this dream but currently fall short: they hallucinate information, struggle with long documents, and break down on complex layouts. The solution: LLMs specialized in information extraction. In this talk, I will present:

- **NuExtract**: the first LLM specialized in extracting structured information (JSON output)
- **NuMarkdown**: the first reasoning OCR LLM (RAG-ready Markdown output)

**These low-hallucination, open-source models outclass frontier LLMs like GPT-5 and Gemini 2.5 while being orders of magnitude smaller**, enabling private usage. I will demonstrate the abilities of these LLMs, show how to use them at scale, and discuss what's coming next in information extraction.

PRESENTED BY

Etienne Bernard

CEO and Co-founder

NuMind

@ Artefact 2025