Examining PDFs as an aggregate
What can thousands of PDFs tell us?
Session description
PDF is not a single application’s file format; it is a file format shared by thousands of applications, and each PDF producer seemingly has its own quirks when generating PDFs. What can we learn by examining a couple of largish collections of PDFs as an aggregate sample?
Within these file sets: Which are the most common PDF producers? Which PDF version is most common? How common are errors? What are the most common errors? How common is PDF tagging? How common is PDF/A? What do these answers mean?
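As a rough illustration of how one of these questions (which PDF version is most common) could be approached, here is a minimal Python sketch. It is not the method used in the session; the function names `header_version` and `tally_versions` and the directory layout are assumptions made for this example. It reads only the `%PDF-M.m` header comment near the start of each file and ignores the fact that a `/Version` entry in the document catalog can override the header, so the counts are approximate.

```python
import re
from collections import Counter
from pathlib import Path
from typing import Optional

# Matches a header comment such as %PDF-1.7 (hypothetical helper pattern).
HEADER_RE = re.compile(rb"%PDF-(\d\.\d)")


def header_version(data: bytes) -> Optional[str]:
    """Return the version from the %PDF-M.m header, or None if absent.

    Only the first 1024 bytes are searched, since some files carry
    junk before the header.
    """
    match = HEADER_RE.search(data[:1024])
    return match.group(1).decode("ascii") if match else None


def tally_versions(root: str) -> Counter:
    """Count header versions for every *.pdf file under root (assumed layout)."""
    counts: Counter = Counter()
    for path in Path(root).rglob("*.pdf"):
        with path.open("rb") as fh:
            head = fh.read(1024)
        counts[header_version(head) or "unknown"] += 1
    return counts
```

Calling `tally_versions("some/collection").most_common()` would then list versions by frequency for that collection. Questions such as the most common producer or the prevalence of tagging and PDF/A require parsing the document's metadata and structure, which is beyond this header-only sketch.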
Presentations on other tracks at the same time
None