Why PDF 2.0 is the new PDF bible
Why PDF 2.0 is the definition of PDF every developer should use – even if your products don’t support PDF 2.0.
If you speak to Peter Wyatt, the PDF Association’s CTO, he’ll tell you straight away that PDF 2.0 is the latest and best definition of PDF, and the one every developer should use, even if your products don’t support PDF 2.0.
If you aren’t a developer you might be saying some version of:
“Huh? How could the PDF 2.0 specification be useful if we haven’t decided to support PDF 2.0?”
Some background
Every PDF developer has one or another version of the PDF specification near at hand. Adobe, the company that invented PDF, first published the original PDF specification - the PDF Reference 1.0 - in 1993, and made it freely available. In the years that followed, Adobe released new freely-available editions of the PDF specification, introducing new PDF versions while also fixing various mistakes or clarifying information in previous editions. Because PDF is backwards compatible, developers became used to referring to the latest PDF specification even for features related to earlier versions - “later” was always “better”!
24 years and 7 versions later, in 2017, Adobe’s PDF Reference 1.7 was still freely available, but it was no longer the latest definition of PDF.
Prompted by the success of PDF/X (2001) and PDF/A (2005), in 2007 Adobe passed the PDF 1.7 Reference to ISO, the international standards organization based in Geneva, Switzerland, for continued development under ISO’s guidance. ISO assigned the document to its Technical Committee 171 SC 2, which promptly published PDF 1.7 via the ISO “fast track” process. Designated as ISO 32000-1, this first ISO standard for PDF was intended to be essentially equivalent to Adobe PDF 1.7, with mostly editorial changes to meet ISO requirements and remove obvious vendor-specific information.
Development of the next version of PDF began in 2008, immediately after publication of ISO 32000-1:2008. This next version was developed entirely in the vendor-neutral, open and consensus-based forums mandated by ISO processes. For the first time, anybody could raise an issue to correct long-standing errors or seek clarification of any wording anywhere in the entire PDF specification!
As a result, many vendors highlighted ambiguities, questioned unstated assumptions, and otherwise contributed to creating the clearest and most unambiguous specification for PDF to date. Adobe proved willing to research their existing implementations and were exceptionally constructive in helping to resolve many issues raised during the community process. After nine years of meetings and thousands of technical and editorial changes, including major rewrites of several important sections, PDF 2.0 was first published as ISO 32000-2:2017; a subsequent “dated revision” was released in 2020.
Resolving the access problem
In 2017 there was a new PDF specification but there was a catch. After 24 years of having the latest PDF specification freely available, ISO’s publication had a price tag. Developers noticed, and distribution was disappointing, as the culture of PDF had always been free. PDF 1.7 was serviceable and PDF 2.0 did not yet add any “must have” features… so it was hard to spend hundreds of dollars or euros per developer to stay up-to-date on PDF standards along with everything else!
As of April 5, 2023, thanks to the sponsorship of PDF Association members Adobe, Apryse and Foxit, the situation has been rectified. ISO 32000-2 is now available at no cost, eliminating a major impediment to access.
Marketplace (mis)impressions
More important than ISO 32000’s cost, another limiting factor is the marketplace impression that…
- PDF 2.0 was a substantially new “thing”, and
- Understanding the benefits of the PDF 2.0 specification required a lot of effort
Both impressions are wrong!
Most of the changes in PDF 2.0 simply clarified ambiguities in PDF 1.7 and earlier PDF references. Although there are new features in PDF 2.0, the specification was always expressly intended to remain backwards-compatible with previous definitions. That is, no changes made in PDF 2.0 broke software that was based on previous editions of the specification.
Whenever a matter of interpretation arises, it’s likely that the PDF 2.0 specification provides more or better information than any earlier definition. “Later” is still “better”.
But at 1,000 pages ISO 32000-2 is still not perfect: the PDF Association’s errata management processes continue to clarify reported issues so that everyone has a clear and unambiguous understanding of PDF. The first 92 errata corrections will also soon be officially published by ISO as ISO 32000-2:2020 Amendment 1, but these are available now via the no-cost download, or this free XFDF file download.
PDF 2.0 also clarifies PDF 1.7… and even older versions like 1.4
PDF 2.0 was not intended to “revolutionize PDF technology” - but it did revolutionize the process by which PDF is agreed and defined. Beyond a modest set of new features, one objective of the move from PDF 1.7 to PDF 2.0 was to increase interoperability between products via open discussions in a vendor-neutral forum.
For this first open iteration of the ISO standard for PDF, therefore, the worldwide community of experts that came together in TC 171 SC 2 focussed on ensuring that the specification was comprehensively documented and more easily understandable.
Let's review some practical examples.
Example 1: How dashed lines are defined, or not
All definitions up to PDF 1.7 (including ISO 32000-1:2008) failed to fully describe how dashed lines are defined in PDF!
Background: In PDF, a line dash pattern consists of a “dash array” and a “dash phase”. Up until PDF 2.0, however, the PDF specification didn’t describe how a dash phase value less than zero should be handled, resulting in different renderings across vendors.
Resolution: This omission was identified and corrected: the PDF 2.0 specification now includes a simple algorithm and example in clause 8.4.3.6, “Line dash pattern”. Here’s the paragraph in question; the new requirement is the sentence beginning “If the dash phase is negative”:
The line dash pattern shall control the pattern of dashes and gaps used to stroke paths. It shall be specified by a dash array and a dash phase. The dash array’s elements shall be numbers that specify the lengths of alternating dashes and gaps; the numbers shall be nonnegative and not all zero. The dash phase shall be a number that specifies the distance into the dash pattern at which to start the dash. If the dash phase is negative, it shall be incremented by twice the sum of all lengths in the dash array until it is positive. The elements of both the dash array and the dash phase shall be expressed in user space units.
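The rule for negative dash phases is small enough to sketch in a few lines of Python. This is only an illustration of the sentence quoted above, not code from the specification: the function name `normalize_dash_phase` and its error handling are assumptions made for this example. The increment is twice the sum of the array because an odd-length dash array repeats with dashes and gaps swapped, so the pattern's full period is two passes through the array.

```python
def normalize_dash_phase(dash_array, dash_phase):
    """Return an equivalent positive phase for a negative dash phase.

    Hypothetical helper: follows the quoted rule literally by adding
    twice the sum of the dash array until the phase is positive.
    """
    if not dash_array:
        # An empty dash array means a solid line, so the phase has no effect.
        return dash_phase
    if any(length < 0 for length in dash_array):
        raise ValueError("dash array lengths must be nonnegative")
    if all(length == 0 for length in dash_array):
        raise ValueError("dash array lengths must not all be zero")
    if dash_phase >= 0:
        # The rule only applies to negative phases.
        return dash_phase

    increment = 2 * sum(dash_array)
    while dash_phase <= 0:
        dash_phase += increment
    return dash_phase
```

For instance, with a dash array of `[3, 2]` the increment is 10, so a phase of -1 becomes 9; with the single-element array `[3]` the increment is 6, so a phase of -4 becomes 2. A renderer that applies this step before stroking should produce the same dashes as one that handles negative offsets natively, which is the interoperability point the clause is making.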
Example 2: Continuous color for PDF blend modes
Background: The “Adobe® Supplement to ISO 32000-1 BaseVersion: 1.7 ExtensionLevel: 5”, published in June 2009, shortly after the publication of PDF 1.7 as ISO 32000-1:2008, corrected the ColorBurn and ColorDodge blend mode formulae with the following note:
Note: These functions are formulated in a different way here than they are in ISO 32000-1. However, they produce the same results except in one special edge case. For ColorDodge, the special case is cb = 0 and cs = 1, where the result is now 0 instead of 1. For ColorBurn, the special case is cb = 1 and cs = 0, where the result is now 1 instead of 0. The rationale for the change is that for any given cb, the result should be a continuous function of cs.
Resolution: PDF 2.0 adopts this correction for both the ColorBurn and ColorDodge blend modes. Each result is now a continuous function of the source (topmost) color cs, providing a far more natural and intuitive blending of colors for graphic artists.
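To see how narrow the change is, the two formulations can be compared numerically. The sketch below is an illustration under stated assumptions: the piecewise forms are written to be consistent with the note quoted above (same results everywhere except the one edge case per blend mode), and the function names are invented for this example. Check the exact wording of the formulae against the specifications themselves before relying on it.

```python
import math


def color_dodge_iso32000_1(cb, cs):
    # Earlier formulation: jumps to 1 when the source color is 1.
    if cs < 1:
        return min(1.0, cb / (1.0 - cs))
    return 1.0


def color_dodge_pdf20(cb, cs):
    # Corrected formulation: a backdrop of 0 always stays 0.
    if cb == 0:
        return 0.0
    if cb >= 1.0 - cs:
        return 1.0
    return cb / (1.0 - cs)


def color_burn_iso32000_1(cb, cs):
    # Earlier formulation: drops to 0 when the source color is 0.
    if cs > 0:
        return 1.0 - min(1.0, (1.0 - cb) / cs)
    return 0.0


def color_burn_pdf20(cb, cs):
    # Corrected formulation: a backdrop of 1 always stays 1.
    if cb == 1:
        return 1.0
    if 1.0 - cb >= cs:
        return 0.0
    return 1.0 - (1.0 - cb) / cs


def differing_points(old, new, steps=4):
    """List the (cb, cs) grid points where two blend functions disagree."""
    grid = [i / steps for i in range(steps + 1)]
    return [
        (cb, cs)
        for cb in grid
        for cs in grid
        if not math.isclose(old(cb, cs), new(cb, cs), abs_tol=1e-12)
    ]
```

On a 5 × 5 grid over [0, 1], `differing_points(color_dodge_iso32000_1, color_dodge_pdf20)` reports only `(0.0, 1.0)` and `differing_points(color_burn_iso32000_1, color_burn_pdf20)` reports only `(1.0, 0.0)`, matching the two special cases named in the note.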
Example 3: Dashing end caps at corners
Background: In many classes of documents, such as CAD, architecture and engineering drawings, dashed lines are critically important to the communication of technical information. In marketing and business documents, an incorrect appearance of dashed lines may reduce the quality and value of the communication.
Although PDF overall has a well specified and precise rendering model, there are many small technical details necessary to ensure that all implementations can achieve consistent and reliable rendering.
Differences in output between “End before bend” vs “Bend before end” order of operation
Resolution: The latest PDF 2.0 specification adds a critical sentence to the second-to-last paragraph of subclause 8.4.3.6 "Line dash pattern":
If the end of a dashed segment coincides exactly with a join point, then the end cap is painted before the corner.
Without this clear “end before bend” statement as specified in PDF 2.0, previous PDF specifications gave no guidance on the correct appearance of dash end caps at corners (the “bend”), resulting in different rendered output between implementations.
[Figure: side-by-side renderings labelled “Bend before end” (left) and “End before bend” (right)]
Conclusion
As the above examples demonstrate, the PDF 2.0 specification is the best reference to use regardless of which PDF version you intend to support. It provides a clearer understanding of what’s expected than did previous editions of the specification, and therefore, provides better guidance on how to implement PDF correctly.
Awareness and understanding of the differences between PDF 2.0 and earlier definitions is supported by a recently-announced process providing PDF Association members with previews of test case PDF files demonstrating selected changes between PDF 1.7 and PDF 2.0.