GitHub repositories
The PDF Association hosts several public and private repositories on GitHub to facilitate a common understanding of PDF and to develop new ideas around PDF technologies. Private repositories are restricted to PDF Association members. Public repositories welcome contributions and comments from anyone; however, all contributors must first complete this form acknowledging their acceptance of the PDF Association's Intellectual Property Rights (IPR) policy.
PDF Errata
This public repository gives developers a means of openly reporting issues with any PDF-based ISO standard for review and resolution by industry experts. All issues in PDF specifications are important, from minor typos and formatting issues to larger problems such as ambiguous, unclear or apparently contradictory statements. Reaching consensus on resolutions as an industry improves PDF interoperability and implementation reliability. This repo supports issues logged against all published PDF 2.0-based ISO standards:
- PDF 2.0 (ISO 32000-2:2020)
- PDF/A-4 (ISO 19005-4:2020)
- PDF/X-6 (ISO 15930-9:2020)
- PDF/VT-3 (ISO 16612-3:2020)
- ECMAScript for PDF 2.0 (ISO 21757-1:2020)
Arlington PDF Model
The Arlington PDF Model is a specification-derived, machine-readable definition of the full PDF document object model (DOM) as defined by the PDF 2.0 specification ISO 32000-2:2020 and its related resolved errata. It provides an easy-to-process structured definition of all formally defined PDF objects (dictionaries, arrays and map objects) and their relationships, beginning with the file trailer, using a simple text-based syntax and a small set of declarative functions. The Arlington PDF Model is applicable to both PDF readers and PDF writers.
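To give a feel for what processing a text-based, machine-readable object definition can look like, here is a minimal sketch in Python. It assumes a tab-separated layout with one row per dictionary key. The column names (`Key`, `Type`, `Required`, `Link`), the sample rows, and the helper functions are all hypothetical and chosen for illustration only. They are not taken from the repository, so consult it for the actual file layout and syntax.

```python
import csv
import io

# Illustrative data only: the column names and rows below are assumptions
# made for this sketch, not copied from the Arlington PDF Model.
SAMPLE_TSV = (
    "Key\tType\tRequired\tLink\n"
    "Type\tname\tTRUE\t\n"
    "Pages\tdictionary\tTRUE\tPageTreeNode\n"
    "Lang\tstring\tFALSE\t\n"
)


def load_object_definition(tsv_text):
    """Parse one tab-separated object definition into a list of row dicts."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return list(reader)


def required_keys(rows):
    """Return the keys that the definition marks as required."""
    return [row["Key"] for row in rows if row["Required"] == "TRUE"]


def linked_objects(rows):
    """Map each key to the object definition it links to, if any."""
    return {row["Key"]: row["Link"] for row in rows if row["Link"]}


rows = load_object_definition(SAMPLE_TSV)
print(required_keys(rows))
print(linked_objects(rows))
```

Following the link values from one definition to the next is how a tool could walk the relationships between objects, for example to check that a file contains every required key.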
PDF 2.0 examples
This is a collection of example PDF 2.0 files that comply with ISO 32000-2:2020. The files in this collection are intended for educational purposes and are intentionally kept relatively simple. Each example illustrates the usage of a new PDF 2.0 feature.
Index of PDF corpora
This index references a number of the more significant public corpora (data sets) that may contain both valid and invalid, real and synthetic PDF files, reflecting the realities of processing PDF files 'from the wild'. In addition, targeted test suites for specific PDF features, ISO subsets of PDF and some of the nested formats used inside PDF files are also listed. It is not intended to be a list of every website where PDFs may be obtained.
Deriving HTML from PDF
This repository is for PDF Association members of the Deriving HTML from PDF and PDF Reuse technical working groups (TWGs) to track work on updates to the document describing the algorithm for deriving HTML from well-tagged PDF 2.0.
SafeDocs
Artifacts from the DARPA-funded SafeDocs research program.
PDF 2.0 RichMedia annotations
This repository is for PDF Association members of the RichMedia TWG, which is examining support for all forms of rich media, including 3D, audio and video, in PDF 2.0.