Synthetic Document Benchmark

Measuring NER accuracy on programmatically generated documents with pixel-perfect ground truth.

Project Overview

The benchmark generates pixel-perfect synthetic documents (invoices, tax forms, medical records, legal pleadings, etc.) with exact bounding boxes and entity labels for every field on the page.
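A ground-truth annotation pairs a pixel-space bounding box with an entity label. As a rough sketch (the field names below are illustrative, not the benchmark's actual schema):

```python
# Hypothetical annotation record -- field names are illustrative only,
# not the benchmark's published schema.
annotation = {
    "doc_id": "invoice-0001",
    "page": 0,
    "bbox": [112, 340, 298, 362],  # pixel coordinates: x0, y0, x1, y1
    "text": "$1,249.00",
    "label": "AMOUNT",             # one of the 19 entity types
}
```

Because the documents are rendered programmatically, coordinates like these are exact rather than estimated from human annotation.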

Entity Types

Every annotation is labeled with one of 19 entity types across two categories. Standard NER types represent extractable information — names, dates, amounts, identifiers — the targets for NER model evaluation. Structural types describe document layout elements like table cells and section headings.

Standard NER (12): PERSON, ORG, DATE, ADDRESS, AMOUNT, PHONE, ID, EMAIL, LOCATION, QUANTITY, URL, DURATION
Structural (7): TABLE_CELL, TABLE_HEADER, FIELD_LABEL, TITLE, SECTION_TITLE, TEXT, CAPTION

Dataset

Dataset      Status    Docs    Pages    Annotations    Composition
SDD-Nano     Current   401     781      45K            101 templates · ~4 docs/template · 19 entity types
SDD-Small    Planned   1K      ~2K      ~114K          101 templates · ~10 docs/template
SDD-Medium   Planned   10K     ~19K     ~1.1M          101 templates · ~99 docs/template
SDD-Large    Planned   100K    ~195K    ~11.4M         101 templates · ~990 docs/template
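Since every tier draws from the same fixed pool of 101 templates, the per-template counts follow directly from the document totals. A quick check:

```python
# Sanity-check the docs-per-template figures: each tier uses the same
# 101 templates, so docs/template is simply total docs / 101.
TEMPLATES = 101
tiers = {
    "SDD-Nano": 401,
    "SDD-Small": 1_000,
    "SDD-Medium": 10_000,
    "SDD-Large": 100_000,
}
per_template = {name: round(docs / TEMPLATES) for name, docs in tiers.items()}
# → {'SDD-Nano': 4, 'SDD-Small': 10, 'SDD-Medium': 99, 'SDD-Large': 990}
```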

Template Gallery


All 257 document templates in the benchmark.

Baseline: DeepSeek-OCR-2 + Qwen3-8B

OCR via DeepSeek-OCR-2 (vision model), NER via Qwen3-8B (prompted, zero-shot). No fine-tuning.

Entity Type   TP      FP    FN      Precision   Recall   F1
PHONE         274     36    23      0.884       0.923    0.903
PERSON        615     119   108     0.838       0.851    0.844
DATE          1,256   318   497     0.798       0.717    0.755
ORG           510     198   226     0.720       0.693    0.706
ID            894     255   653     0.778       0.578    0.663
AMOUNT        1,938   644   1,843   0.751       0.513    0.609
ADDRESS       358     550   127     0.394       0.738    0.514
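The precision, recall, and F1 columns are derived from the raw TP/FP/FN counts in the standard way, which makes the table easy to verify:

```python
def prf(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 from raw entity-match counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# PHONE row from the baseline table:
p, r, f = prf(274, 36, 23)
# → p ≈ 0.884, r ≈ 0.923, f ≈ 0.903
```

Note how ADDRESS combines high recall (0.738) with low precision (0.394): the pipeline finds most addresses but also produces many spurious ones, which drags its F1 below every other type.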