DoCO-FD Ontology

Document Components Ontology - Form Documents Extension

Namespace: http://purl.org/doco-fd/
Prefix: doco-fd: · Version: 1.0.0

What is DoCO-FD?

DoCO-FD is a formal OWL 2 DL extension of the Document Components Ontology (DoCO) that provides a structured vocabulary for form-based documents.

While DoCO excels at modeling academic papers and books, it lacks specific classes for business forms, government documents, invoices, medical records, and other structured data entry documents. DoCO-FD fills these gaps.

Figure: DoCO-FD architecture. DoCO, the base ontology (Section, TextBox, Table, FrontMatter, BackMatter), is extended by DoCO-FD (FormField, FormSection, TableRow, TableCell, Header, Footer). Schema.org (Person, Organization) supplies the semantic meaning. Together they produce an RDF graph that is structured, semantic, and machine-readable.

Key Features

📝 Form Structure

FormSection, FormField, and KeyValuePair classes for structured data entry documents

📊 Table Internals

TableRow, TableCell, and TableHeader for complete table annotation (missing from base DoCO)

📄 Document Layout

Header and Footer classes extending DoCO's FrontMatter and BackMatter

🔗 Dual-Layer Markup

Works alongside Schema.org for complete semantic + structural annotation

Design Philosophy

1. Extends, Doesn't Replace

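DoCO-FD is a proper extension of DoCO: it adds classes without redefining or replacing existing ones. New classes are declared as subclasses of existing ones, for example:

doco-fd:FormSection extends doco:Section
doco-fd:FormField and doco-fd:KeyValuePair extend doco:TextBox
doco-fd:Header extends doco:FrontMatter
doco-fd:Footer extends doco:BackMatter
doco-fd:TableHeader extends doco-fd:TableCell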

2. Dual-Layer Markup with Schema.org

Example: Patient Name Field

<div typeof="schema:Person doco-fd:FormField" resource="#patient">
  <span property="doco-fd:fieldLabel">Patient Name:</span>
  <span property="schema:name doco-fd:fieldValue">John Doe</span>
</div>

DoCO-FD = Document structure (it's a form field)
Schema.org = Semantic meaning (it's a person)

Classes

Form Structure

Class doco-fd:FormSection

Extends: doco:Section

A logical grouping of related form fields within a document, typically representing a coherent unit of information collection (e.g., "Personal Information", "Billing Address", "Payment Details").

<section typeof="doco-fd:FormSection" resource="#billingInfo">
  <h2 property="doco:sectionTitle">Billing Information</h2>
  <!-- form fields -->
</section>

Class doco-fd:FormField

Extends: doco:TextBox

A structured element for data entry or display, consisting of a label and a value. Form fields are the atomic units of information in form-based documents.

<div typeof="doco-fd:FormField" resource="#invoiceNumber">
  <span property="doco-fd:fieldLabel">Invoice Number:</span>
  <span property="doco-fd:fieldValue">INV-2024-001</span>
</div>

Class doco-fd:KeyValuePair

Extends: doco:TextBox

A semantic pairing of a label (key) and its associated data value, representing a single piece of structured information. Similar to FormField but emphasizes the key-value relationship pattern.

<div typeof="doco-fd:KeyValuePair" resource="#totalAmount">
  <span property="doco-fd:fieldLabel">Total Amount:</span>
  <span property="doco-fd:fieldValue">$1,234.56</span>
</div>

Document Layout

Class doco-fd:Header

Extends: doco:FrontMatter

The top section of a document containing identifying information, logos, titles, and metadata.
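
The description above comes without markup, so the following is a minimal sketch of how a header might be annotated, following the pattern of the other class examples. The resource identifiers (#header, #formNumber), the title, and the field content are illustrative assumptions and not part of the ontology. The doco-fd prefix is assumed to be declared on an enclosing element.

```html
<!-- Hypothetical header: identifiers and text are illustrative only -->
<header typeof="doco-fd:Header" resource="#header">
  <h1>Patient Intake Form</h1>
  <!-- Identifying metadata expressed as an ordinary form field -->
  <div typeof="doco-fd:FormField" resource="#formNumber">
    <span property="doco-fd:fieldLabel">Form Number:</span>
    <span property="doco-fd:fieldValue">F-001</span>
  </div>
</header>
```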

Class doco-fd:Footer

Extends: doco:BackMatter

The bottom section of a document containing supplementary information, disclaimers, page numbers, and legal notices.
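
As a sketch under the same assumptions as the other examples (prefixes declared on an enclosing element, hypothetical identifiers and text), a footer carrying a disclaimer and a page number might be marked up like this:

```html
<!-- Hypothetical footer: identifiers and text are illustrative only -->
<footer typeof="doco-fd:Footer" resource="#footer">
  <!-- Plain disclaimer text; no DoCO-FD property is assumed for it -->
  <p>This document is confidential.</p>
  <div typeof="doco-fd:FormField" resource="#pageNumber">
    <span property="doco-fd:fieldLabel">Page:</span>
    <span property="doco-fd:fieldValue">1</span>
  </div>
</footer>
```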

Table Components

Class doco-fd:TableRow

A horizontal sequence of table cells representing a single record. DoCO defines Table but has no concept of a row; TableRow fills that gap.

<tr typeof="doco-fd:TableRow" resource="#row1">
  <td typeof="doco-fd:TableCell">Product A</td>
  <td typeof="doco-fd:TableCell">$25.00</td>
</tr>

Class doco-fd:TableCell

An individual data container at the intersection of a row and column.

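<!-- property sits on a child element: in RDFa 1.1, property and typeof on the
     same element make the typed resource, not the text, the property value -->
<td typeof="doco-fd:TableCell schema:MonetaryAmount">
  <span property="doco-fd:cellValue schema:value">$1,234.56</span>
</td>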

Class doco-fd:TableHeader

Extends: doco-fd:TableCell

A cell that provides a descriptive label for a column or row.

<!-- property on a child element so the header text is captured as a literal -->
<th typeof="doco-fd:TableHeader">
  <span property="doco-fd:columnHeader">Description</span>
</th>

Properties

Data Properties

Property               Domain        Range                    Description
doco-fd:fieldLabel     FormField     xsd:string               Descriptive label for the field
doco-fd:fieldValue     FormField     rdfs:Literal             Data content of the field
doco-fd:cellValue      TableCell     rdfs:Literal             Content of the table cell
doco-fd:columnHeader   TableHeader   xsd:string               Column header text
doco-fd:rowIndex       TableRow      xsd:nonNegativeInteger   Zero-based row position
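
None of the examples in this document use doco-fd:rowIndex, so here is one possible way to attach it. The sketch assumes the index is carried in a content attribute with an explicit datatype rather than as visible text. The identifiers (#items, #row0) and cell values are illustrative.

```html
<table typeof="doco:Table" resource="#items"
       prefix="doco: http://purl.org/spar/doco/
               doco-fd: http://purl.org/doco-fd/
               xsd: http://www.w3.org/2001/XMLSchema#">
  <tbody>
    <!-- about names the row; content + datatype yield a typed, zero-based index -->
    <tr about="#row0" typeof="doco-fd:TableRow"
        property="doco-fd:rowIndex" content="0"
        datatype="xsd:nonNegativeInteger">
      <td typeof="doco-fd:TableCell">
        <span property="doco-fd:cellValue">Widget A</span>
      </td>
      <td typeof="doco-fd:TableCell">
        <span property="doco-fd:cellValue">$25.00</span>
      </td>
    </tr>
  </tbody>
</table>
```

Because content and datatype are present, an RDFa 1.1 processor should emit the index as a typed literal on #row0 instead of treating the row's text as the value.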

Complete Example: Invoice

Full RDFa Markup Example

<!DOCTYPE html>
<!-- RDFa 1.1 prefix attribute replaces the deprecated xmlns:* prefix declarations -->
<html xmlns="http://www.w3.org/1999/xhtml"
      prefix="schema: http://schema.org/
              doco: http://purl.org/spar/doco/
              doco-fd: http://purl.org/doco-fd/">

<body typeof="schema:Invoice doco:BodyMatter" resource="#invoice">

  <!-- Header -->
  <header typeof="doco-fd:Header" resource="#header">
    <h1 property="doco:title">Invoice</h1>
    <div typeof="doco-fd:FormField">
      <span property="doco-fd:fieldLabel">Invoice #:</span>
      <!-- Kept on one line so the literal carries no surrounding whitespace -->
      <span property="schema:identifier doco-fd:fieldValue">INV-2024-001</span>
    </div>
  </header>

  <!-- Customer Section -->
  <section typeof="doco-fd:FormSection" resource="#customer">
    <h2 property="doco:sectionTitle">Customer</h2>
    <div typeof="schema:Person doco-fd:FormField">
      <span property="doco-fd:fieldLabel">Name:</span>
      <span property="schema:name doco-fd:fieldValue">John Smith</span>
    </div>
  </section>

  <!-- Line Items Table -->
  <table typeof="doco:Table" resource="#items">
    <thead>
      <tr>
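        <!-- property on a child element so the header text is captured as a literal -->
        <th typeof="doco-fd:TableHeader">
          <span property="doco-fd:columnHeader">Product</span>
        </th>
        <th typeof="doco-fd:TableHeader">
          <span property="doco-fd:columnHeader">Price</span>
        </th>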
      </tr>
    </thead>
    <tbody>
      <tr typeof="doco-fd:TableRow">
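        <!-- property on a child element so the cell text is captured as a literal -->
        <td typeof="doco-fd:TableCell">
          <span property="doco-fd:cellValue">Widget A</span>
        </td>
        <td typeof="doco-fd:TableCell">
          <span property="doco-fd:cellValue">$25.00</span>
        </td>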
      </tr>
    </tbody>
  </table>

  <!-- Total -->
  <div typeof="doco-fd:KeyValuePair schema:MonetaryAmount">
    <span property="doco-fd:fieldLabel">Total:</span>
    <span property="schema:value doco-fd:fieldValue">$1,234.56</span>
  </div>

</body>
</html>

Class Hierarchy

owl:Thing
│
├─ doco:Section
│  └─ doco-fd:FormSection
│
├─ doco:TextBox
│  ├─ doco-fd:FormField
│  └─ doco-fd:KeyValuePair
│
├─ doco:FrontMatter
│  └─ doco-fd:Header
│
├─ doco:BackMatter
│  └─ doco-fd:Footer
│
├─ doco:Table
│  └─ doco-fd:TableRow
│     └─ doco-fd:TableCell
│        └─ doco-fd:TableHeader

Usage in SDD

DoCO-FD is used throughout the Synthetic Document Dataset (SDD) to provide semantic structure.

Integration: The ontology is automatically applied during document generation. Every form field in generated documents includes both structural (DoCO-FD) and semantic (Schema.org) markup.

Resources

Citation: If you use DoCO-FD in your work, please cite: Blass, D. (2026). DoCO-FD: Document Components Ontology - Form Documents Extension. Version 1.0.0. http://purl.org/doco-fd/