RESOURCE

Arlington PDF Model

The Arlington PDF Model is a free and open-source machine-readable data model of all PDF objects. It was originally developed under the DARPA-funded SafeDocs program and is now available via GitHub. As of June 2023 it remains under active development.

Derived directly from the latest PDF 2.0 specification, ISO 32000-2:2020, and its resolved errata, the Arlington PDF Model is entirely vendor- and implementation-neutral. It represents the full PDF document object model (DOM) as an easy-to-process structured definition, a set of tab-separated values (TSV) files, covering all formally defined PDF objects (dictionaries, arrays and map objects) and their relationships, beginning with the PDF file trailer. The definition uses a simple text-based syntax and a small set of predicates (declarative statements about data integrity relationships).
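Because the model is plain TSV, it can be read with standard tooling. The following is a minimal sketch of that idea, not the model's own tooling: the column names (`Key`, `Type`, `Required`, `Link`), the sample rows, and the helper functions are illustrative assumptions, and predicates are not evaluated. Consult the GitHub repository for the actual file layout.

```python
import csv
import io

# Hypothetical excerpt in the style of a per-object TSV file. The column
# names and row values are illustrative assumptions, not copied from the model.
SAMPLE_TSV = (
    "Key\tType\tRequired\tLink\n"
    "Type\tname\tTRUE\t\n"
    "Pages\tdictionary\tTRUE\t[PageTreeNodeRoot]\n"
    "Metadata\tstream\tFALSE\t[Metadata]\n"
)


def load_object_definition(tsv_text):
    """Parse one object definition into a dict keyed by the PDF key name."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["Key"]: row for row in reader}


def required_keys(definition):
    """Return the keys whose Required column is the literal string TRUE."""
    return [key for key, row in definition.items() if row["Required"] == "TRUE"]


def linked_objects(definition):
    """Map each key to the object definitions named in its Link column."""
    links = {}
    for key, row in definition.items():
        names = row["Link"].strip("[]")
        links[key] = [name for name in names.split(",") if name]
    return links


catalog = load_object_definition(SAMPLE_TSV)
print(required_keys(catalog))
print(linked_objects(catalog)["Pages"])
```

Following the `Link` column from one object definition to the next is what would let a checker walk the relationships between objects; this sketch only collects the names and does not perform that traversal.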

The Arlington PDF Model does not replace ISO 32000-2, and must always be used in conjunction with the PDF 2.0 specification in order to fully understand the PDF DOM.

This resource is primarily intended for PDF developers to help them check their implementations and understanding, avoid malformed PDFs, and help bring PDF technology into alignment with the latest and most up-to-date definition of PDF.

The model was named “Arlington” in recognition of DARPA’s contribution to advancing PDF technology (DARPA is headquartered in Arlington, Virginia).


This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. HR001119C0079. Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Defense Advanced Research Projects Agency (DARPA). Approved for public release.

Published: June 15, 2023
