Presented at PDF Days Europe 2021
(September 2021)

The Arlington PDF Model

A specification-derived, machine-readable definition of PDF

Session description

This talk presents the Arlington PDF Model, the first open-access, vendor-neutral, comprehensive, specification-derived, machine-readable definition of all formally defined PDF objects and their intra- and inter-object relationships. The model captures the bulk of the latest 1,000-page ISO PDF 2.0 specification as a machine-readable, text-based definition of the entire PDF DOM, and establishes a state-of-the-art “ground truth” for future PDF research efforts and for implementers. It can be used with trivial Linux commands, simple scripts, or more advanced programs, and supports a multitude of potential use cases, including test case generation, validation of extant data, parser generation, modelling, and rapid forensic analysis of PDF syntax fragments.
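As a rough illustration of the "simple scripts" use case, the sketch below checks a parsed PDF dictionary against a text-based object definition. It assumes the definition is a tab-separated table with one row per key; the column names (Key, Type, Required), the sample rows, and the helper names load_definition and validate are hypothetical and chosen for illustration only, not taken from the model itself.

```python
import csv
import io

# Hypothetical sample definition: the columns and rows below are assumptions
# made for this sketch, not an excerpt from the actual model files.
SAMPLE_TSV = (
    "Key\tType\tRequired\n"
    "Type\tname\tTRUE\n"
    "Pages\tdictionary\tTRUE\n"
    "Lang\tstring\tFALSE\n"
)


def load_definition(text):
    """Parse tab-separated definition text into a dict keyed by the Key column."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["Key"]: row for row in reader}


def validate(obj, definition):
    """Return a list of problems found in obj (a dict of PDF keys to values)."""
    problems = []
    # Every key marked as required in the definition must be present.
    for key, row in definition.items():
        if row["Required"] == "TRUE" and key not in obj:
            problems.append(f"missing required key: {key}")
    # Keys not listed in the definition are reported, not rejected.
    for key in obj:
        if key not in definition:
            problems.append(f"unknown key: {key}")
    return problems


if __name__ == "__main__":
    definition = load_definition(SAMPLE_TSV)
    for problem in validate({"Type": "Catalog", "Lang": "en"}, definition):
        print(problem)
```

A real check would also compare value types and follow links between object definitions; this sketch covers only key presence.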

Peter Wyatt
PDF Association

Slides download: https://pdfa.org/wp-content/uploads/2021/06/PDF-Days-2021-Arlington-PDF-model.pdf
