Outclassing Frontier LLMs at Extracting Information

Description

Accurately extracting information from documents has been a dream for decades. Important workflows, from automated back-office processing to enterprise RAG, depend on it. LLMs promise to fulfill this dream but currently fall short: they hallucinate information, struggle with long documents, and break down on complex layouts. The solution: LLMs specialized in information extraction.

In this talk, I will present:

- **NuExtract**: the first LLM specialized in extracting structured information (JSON output)
- **NuMarkdown**: the first reasoning OCR LLM (RAG-ready Markdown output)

**These low-hallucination, open-source models outclass frontier LLMs like GPT-5 and Gemini 2.5 while being orders of magnitude smaller**, enabling private usage. I will demonstrate the abilities of these LLMs, show how to use them at scale, and discuss what's coming next in information extraction.

Speaker: Etienne Bernard, CEO and Co-founder
Submitted 14/10/2025, 17:24

PRESENTED BY