Planet Earth with the light of cities illuminating Europe and the Middle East.
New large-scale PDF corpus now publicly available

This new corpus – nearly 8 million PDFs totaling about 8 TB – was gathered from across the web in July/August of 2021.



A day to think about the PDF standard…

callas software’s Dietrich von Seggern describes what “owned by ISO” really means, and why that matters, in celebration of World Standards Day.

World Standards Day graphics

PDF standards explained: with a focus on the newest

Dietrich von Seggern from callas software explains the PDF standards, focusing on the newest ones.

computer screen listing PDF validation items

PDF’s popularity online

According to CommonCrawl’s detected MIME types, PDF is the third most popular file format on the web (behind HTML and XHTML), ahead of JPEG, PNG and GIF.

Chart depicting the popularity of digital document formats online.

Prepress automation in label printing

Dietrich von Seggern describes optimal uses of automation in label printing and discusses the integration of print processing steps in PDF production workflows.

blurred office building with fire hydrant in front and Print Matters imposed on top of the picture

Overhauling PDF Forms technology

The Technical Working Group’s chair explains the current missions of the PDF Forms TWG: a declarative model for business logic, alignment with web forms, and optimized accessibility.

Electronic form with fields for first name, last name, and your email with the PDF Association logo.

Busting the Myth that PDF Cannot Be Accessible

PDF documents can be as accessible as web pages if presented correctly.

Sunrise over mountains and a lake.

Linking research and industry

To help both academic and industry researchers achieve high-quality, accurate PDF-oriented research outcomes, the PDF Association now offers a free peer-review service.

Man walking through a maze

Doing PDF right

Doing PDF right means being aware of errata for PDF 2.0. With over 65 issues now resolved and approved by the PDF Association’s PDF TWG, the body of knowledge in the “PDF Issues” GitHub repo is already significant.

Screenshot of a PDF issue on 9.8 Font descriptors, with some text crossed out and other text, in green, being inserted.

Reporting issues on PDF 2.0-related standards

The PDF Association expands its pdf-issues GitHub repo to cover all ISO standards based on PDF 2.0.

Github, PDF Association and ISO logos

SafeDocs’ latest research into securing PDF

This year’s LangSec IEEE S&P Workshop includes several PDF-related presentations resulting from SafeDocs research, including a paper by CTO Peter Wyatt.

Arlington PDF Model logo