Outclassing Frontier LLMs at Extracting Information
Description
Accurately extracting information from documents has been a dream for decades. Important workflows, from automated back-office processing to enterprise RAG, depend on it.
LLMs promise to fulfill this dream but currently fall short: they hallucinate information, struggle with long documents, and break down on complex layouts.
The solution: LLMs specialized in information extraction.
In this talk, I will present:
- **NuExtract**: the first LLM specialized in extracting structured information (JSON output)
- **NuMarkdown**: the first reasoning OCR LLM (RAG-ready Markdown output)
**These low-hallucination [open-source] models outclass frontier LLMs like GPT-5 and Gemini 2.5 while being orders of magnitude smaller**, enabling private usage.
I will demonstrate the abilities of these LLMs, show how to use them at scale, and discuss what’s coming next in information extraction.
Name speaker 1:
Speaker must be CEO / Co-founder / Founder
Etienne Bernard
Job title speaker 1
CEO and Co-founder
Email address of speaker 1
etienne@numind.ai
Photo of speaker 1
etienne_pic - Amruta Nandargi.jpeg
Tracks AI for Finance
Which track(s) of the summit does your presentation fall into?
Shaping the Future of Finance with AI and Industry Leaders
Digital Sovereignty and AI Regulation: Landscape and Opportunities
AI and Digital Sobriety: Building High-Performance Models with a Reduced Footprint
Enhancing operational efficiency through AI
Building Trust with Ethical and Accountable AI
Data: The Driving Force Behind the Financial Revolution
Gen AI for Customer Relation
AI and Financial Security: Fighting Fraud and Cyber Threats
Tracks AI for Health
Which track(s) of the summit does your presentation fall into?
Future of Health & Agentic AI
Augmented Patients, Healthcare Professionals & Workflows
4P Medicine
Research & Development
Health Innovation & Ecosystem
Data Management & Valorization
Policy, Governance & Trust in AI
Slides and Contact - AI for Health
If you have your slides for the presentation ready, you can send them directly here or by mail to lyse
Do not use any special characters, fonts, or overlays, as we will combine all slides into a single master deck
Let us know if your slides include a video, and whether it has audio
Send us a copy of your video by email as a backup
Any other information you would like to share?
Optional
Slides and Contact - AI for Finance / Adopt AI
If you have your slides for the presentation ready, you can send them directly here or by mail to alicia.garrigoux@artefact.com
Do not use any special characters, fonts, or overlays, as we will combine all slides into a single master deck
Let me know if your slides include a video, and whether it has audio
Send me a copy of your video by email as a backup
Any other information you would like to share?
Optional
Slides and Contact - AI for Industry
If you have your slides for the presentation ready, you can send them directly here or by mail to clara.moschetti@artefact.com
Do not use any special characters, fonts, or overlays, as we will combine all slides into a single master deck
Let me know if your slides include a video, and whether it has audio
Send me a copy of your video by email as a backup
Any other information you would like to share?
Optional
Slides and Contact - AI for the Planet / Sport / Travel / Retail & Consumers / Retail
If you have your slides for the presentation ready, you can send them directly here or by mail to camille.guillard@artefact.com
Do not use any special characters, fonts, or overlays, as we will combine all slides into a single master deck
Let me know if your slides include a video, and whether it has audio
Send me a copy of your video by email as a backup
Any other information you would like to share?
Optional
Submitted 14/10/2025, 17:24