PDF Differences – one year later
Repeatable, reliable rendering and functionality are at the very heart of PDF’s value to users.
The PDF specification is a long, complex, largely natural-language description of how PDF features are rendered; it defines how PDF documents are supposed to appear in viewers and renderers. When the PDF specification moved into the vendor-neutral, consensus-based ISO forum, many vendors were finally able to raise their concerns about requirements that were ambiguous, contradictory, or at odds with the behavior of major implementations. This allowed the assembled experts from across industry to address each issue in turn, improving the definition of PDF for everyone.
BUSINESS NOTE
The PDF Association stresses the importance of proactively monitoring the public pdf-differences GitHub repository to ensure a correct understanding of PDF.
However, not all developers were aware of these particular concerns, or were aware of the corresponding solutions and improvements made to the definition of PDF with ISO 32000-2:2020. Given both the magnitude and complexity of the PDF specification and the nuances of “ISO-ese”, many of the wording improvements are not obvious to those not actively involved in ISO standards development.
The synthetic targeted test files we provide in the public pdf-differences GitHub repository are specifically designed to make the changes in the latest PDF specification, ISO 32000-2:2020, as visually understandable as possible. Each case is built around a past issue that was discussed by PDF experts at the PDF Association or ISO tables and that resulted in corrections or clarifications to the PDF specification (ISO 32000-2:2020 including errata set 2).
One significant contribution was made by DARPA’s SafeDocs program. In November 2021, DARPA researchers identified a previously overlooked ambiguity in the latest edition of the core PDF specification, which we reported on that month: in an inline image, the abbreviated and full-length names of the same key might both be present and disagree. Once this issue was resolved (see PDF errata #3, resolved with abbreviated keys having precedence), the PDF Association created a targeted synthetic test file to assess implementations. At that time, not a single tested implementation passed. Three years later, it is gratifying to see that many (but unfortunately not all!) implementations now get it right: our latest informal testing showed that about three-quarters of viewers now render this test case correctly.
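To make the resolved rule concrete, here is a minimal Python sketch of how a parser might normalize an inline image dictionary so that abbreviated keys take precedence over their full-length equivalents. The function name, the `(key, value)` input shape, and the treatment of repeated keys are assumptions made for illustration; the abbreviation table is a non-exhaustive subset based on our reading of the inline image key table in ISO 32000 and should be checked against the specification.

```python
# Abbreviated -> full-length inline image key names.
# Assumption: a non-exhaustive subset; verify against ISO 32000-2.
ABBREVIATIONS = {
    "BPC": "BitsPerComponent",
    "CS": "ColorSpace",
    "D": "Decode",
    "DP": "DecodeParms",
    "F": "Filter",
    "H": "Height",
    "IM": "ImageMask",
    "I": "Interpolate",
    "W": "Width",
}


def resolve_inline_image_keys(entries):
    """Return a dict keyed by full-length names.

    `entries` is a list of (key, value) pairs in file order, as a
    hypothetical parser might yield them between the BI and ID operators.
    An abbreviated key always wins over its full-length equivalent,
    regardless of which one appears first.
    """
    resolved = {}
    from_abbreviation = set()  # full names already set via an abbreviation
    for key, value in entries:
        full = ABBREVIATIONS.get(key)
        if full is not None:
            # Abbreviated form: overrides any earlier full-length value.
            # Assumption: if the same abbreviation repeats, the later one wins.
            resolved[full] = value
            from_abbreviation.add(full)
        elif key not in from_abbreviation:
            # Full-length (or unabbreviated) key with no abbreviated rival so far.
            resolved[key] = value
    return resolved


example = [("Width", 10), ("W", 20), ("H", 5), ("Height", 7)]
print(resolve_inline_image_keys(example))  # {'Width': 20, 'Height': 5}
```

In the example, `W` overrides the earlier `Width`, and the later `Height` is ignored because `H` was already seen, so the result does not depend on the order of the keys.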
In late 2023, the PDF Association publicly released the first set of cases. Since then, additional sets of cases have been identified and publicly released, including:
- Atomic Fill and Stroke - a critical difference between SVG and PDF rendering!
- Updated ColorBurn and ColorDodge blend mode formula
- Various dashed line clarifications
- Handling of Indexed color and default color spaces
- Negative font size handling
While many cases directly relate to page rendering, some cases relate to other types of important end-user functionality such as support for page labels (see this article) and testing for unknown filters (especially important in preparation for future PDF changes, see this article).
Although some implementations have made corrections or improvements in the past 12 months, it is clear that many have not – as a result, many implementations continue to fail to support PDF correctly! This leads to visual differences between implementations which can be extremely confusing and frustrating for users.
TIP: if you want to see which repositories you’re watching, visit https://github.com/watching. Click the down arrow icon for pdf-association/pdf-differences and confirm you are watching for “All activity”.
In some cases, with some interactive viewers, incorrect results appear at certain zoom levels, or when pages are scrolled, so be sure to thoroughly check for consistent appearance.
As both vendors and stakeholders continue to raise questions to the PDF Association about issues they encounter with PDF, we will continue to develop new cases to assist in achieving a common PDF appearance and core functionality. New cases may not be individually announced so all developers of PDF renderers should actively watch the pdf-differences GitHub repository for all changes/updates.
Similar to responsible reporting of cybersecurity issues, members of the PDF Association receive 60-day advance notice of all new cases, allowing corrections to be implemented before the cases become publicly available in the pdf-differences repository.
When purchasing or acquiring PDF software, these same synthetic targeted test files can also be used to assess aspects of rendering from different vendors.