Questions and Answers about Tagged PDF

"Tagged PDF" (see ISO 32000-2, clause 14.8 for the current technical definition) was introduced with PDF 1.4 in 2001.

Like HTML tags (e.g., <h2>, <p>), Tagged PDF allows PDF authors to make richer PDF files that go beyond text and images on pages to establish the intended reading order and indicate the semantic intent of each piece of content (e.g., List, Heading, Table).

This resource provides authoritative answers to common questions about Tagged PDF, especially regarding its utility for reuse and accessibility.

It is maintained by the PDF Association's PDF/UA TWG and PDF Reuse TWG.

Specifications & Standards

Q: What is Tagged PDF?

A: Tagged PDF is a feature introduced in PDF 1.4 (2001) to enable rich context extraction, reuse, and accessibility, and was extended in subsequent versions of the PDF specification.

Tagged PDF ensures predictable reading order and document navigation through the embedding of semantic structures such as Lists, Headings, Tables, and Figures. These structures are represented in PDF via a mechanism colloquially known as “tags”. Tags allow software to handle content appropriately in a reuse context (e.g., presenting an alternate, possibly reflowing, view of a page), or in conjunction with assistive technology such as a screen reader, magnifier, Braille display, or sip-and-puff device.
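As a rough illustration of how a tag tree fixes reading order, the sketch below models structure elements as a small Python tree and walks it depth-first. The `StructElem` class and `reading_order` function are hypothetical names invented for this example; they are not part of any PDF library, and a real structure tree carries far more detail (for instance, list items normally contain label and body elements).

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class StructElem:
    """Simplified, hypothetical stand-in for a PDF structure element (a "tag")."""
    tag: str                                   # structure type, e.g. "H1", "P", "L"
    text: str = ""                             # content attached directly to this element
    kids: List["StructElem"] = field(default_factory=list)


def reading_order(elem: StructElem) -> Iterator[Tuple[str, str]]:
    """Yield (tag, text) pairs by walking the tree depth-first, in document order."""
    if elem.text:
        yield elem.tag, elem.text
    for kid in elem.kids:
        yield from reading_order(kid)


doc = StructElem("Document", kids=[
    StructElem("H1", "Quarterly report"),
    StructElem("P", "Sales rose in every region."),
    StructElem("L", kids=[
        StructElem("LI", "North"),
        StructElem("LI", "South"),
    ]),
])

for tag, text in reading_order(doc):
    print(f"{tag}: {text}")
```

Because the order comes from the tree rather than from where the text happens to sit on the page, a consumer such as a screen reader or a reflow view can present the heading, paragraph, and list items in the sequence the author intended.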

Q: What is ISO 32000-1 (PDF 1.7)?

A: PDF 1.0 to PDF 1.7 were early versions of the PDF specification published by Adobe. PDF 1.7 was standardized as ISO 32000-1 in 2008.

In addition to defining all aspects of PDF, PDF 1.7 specified Tagged PDF with the core structures necessary for PDF files to be accessible.

Q: What is ISO 14289-1 (PDF/UA-1)?

A: PDF/UA-1 (ISO 14289-1, published in 2014) is an ISO standard defining accessible PDF for files based on PDF 1.7.

PDF/UA-1 can be obtained from the PDF Association or from ISO. ISO does not allow the PDF Association to make PDF/UA freely available.

Q: What is ISO 32000-2 (PDF 2.0)?

A: PDF 2.0 is the latest version of the PDF specification. It was developed by experts from across the worldwide PDF industry via the ISO working group process. First published as ISO 32000-2 in 2017, PDF 2.0 was updated in 2020, with substantial updates to the section on Tagged PDF, including the addition of new structure elements and attributes. It also included other features relevant to accessibility, such as support for MathML, associated files, namespaces, structure destinations, and more.

Thanks to its industry sponsors, PDF 2.0 is available at no cost from the PDF Association. ISO and ISO member bodies continue to sell the document.

Q: What is ISO TS 32005?

A: ISO 32000-2 did not specify restrictions and inclusion rules for the standard structure namespace for PDF 1.7 in PDF 2.0 files (see “What are namespaces?” below). The absence of clear rules regarding the interaction of the PDF 1.7 namespace and PDF 2.0 namespace caused ambiguity in the creation of tagged PDF documents conforming to ISO 32000-2. The primary purpose of ISO TS 32005 is to extend the rules already defined within ISO 32000-2 to clarify how PDF 1.7 and PDF 2.0 structure elements coexist in the same document.

Q: What is “Well Tagged PDF” (WTPDF)?

A: WTPDF specifies the use of Tagged PDF for the purposes of content reuse and accessibility in PDF 2.0 files. The document was developed by the PDF Association’s PDF/UA and PDF Reuse Technical Working Groups in conjunction with ISO TC 171/SC 2/WG 9. As WTPDF includes the same normative requirements as PDF/UA-2, files that conform to WTPDF’s accessibility conformance level may also be marked as conforming to PDF/UA-2.

WTPDF’s provisions are based on ISO 32000-2 (PDF 2.0), so WTPDF cannot be used with PDF 1.7 or earlier files. To provide interoperability and consistent usage for PDF 2.0 files combining the PDF 1.7 and PDF 2.0 namespaces in the same PDF file, WTPDF mandates conformance with ISO TS 32005.

PDF files conforming to this specification must include a PDF Declaration that serves as the author’s statement that the file meets the specification’s conformance levels for reuse and/or accessibility.

Q: What is PDF/UA-2?

A: ISO 14289-2 (PDF/UA-2), published in 2024, is an ISO standard defining accessible PDF for PDF 2.0 files conforming to ISO 32000-2. PDF/UA-2’s file format requirements are identical to those of WTPDF’s conformance level for accessibility. Accordingly, a PDF file that conforms with WTPDF’s conformance level for accessibility thereby automatically conforms to the rules of PDF/UA-2, and thus can also be marked as a conforming PDF/UA-2 file.

PDF/UA-2’s provisions are based on ISO 32000-2 (PDF 2.0), so PDF/UA-2 cannot be used with PDF 1.7 or earlier files. To provide interoperability and consistent usage for PDF 2.0 files combining the PDF 1.7 and PDF 2.0 namespaces in the same PDF file, PDF/UA-2, like WTPDF, mandates conformance with ISO TS 32005.

Additionally, by complying with WTPDF’s conformance level for reuse, PDF/UA-2 files can improve the reusability of content without impacting accessibility.

Q: What is “Deriving HTML from PDF”?

A: The PDF Association’s specification for Deriving HTML from PDF defines an algorithm designed to consistently generate compliant HTML from a WTPDF. It demonstrates how these PDF files can be effectively and reliably repurposed as HTML, ensuring uniform conversion that supports better user experiences and content reuse.

Relationships between standards

Q: How are PDF versions related to PDF/UA and WTPDF?

A: See the following table.

File version | PDF/UA-1 | WTPDF & PDF/UA-2
PDF 1.7      | Yes      | Need to first upgrade to PDF 2.0.
PDF 2.0      | No       | Yes
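The table above can be expressed as a small lookup. This is only a sketch: `applicable_specs` is a hypothetical helper, and treating every 1.x version like PDF 1.7 is an assumption based on the statement below that PDF/UA-1 applies to PDF 1.x files.

```python
from typing import List


def applicable_specs(pdf_version: str) -> List[str]:
    """Return the specifications from the table that a file of this version can target."""
    major = pdf_version.strip().split(".")[0]
    if major == "1":
        # PDF/UA-1 is written against PDF 1.7 and applies to PDF 1.x files.
        return ["PDF/UA-1"]
    if major == "2":
        # WTPDF and PDF/UA-2 are both based on PDF 2.0.
        return ["WTPDF", "PDF/UA-2"]
    raise ValueError(f"unrecognized PDF version: {pdf_version!r}")


print(applicable_specs("1.7"))  # ['PDF/UA-1']
print(applicable_specs("2.0"))  # ['WTPDF', 'PDF/UA-2']
```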

Q: Is PDF/UA-2 an update of PDF/UA-1?

A: No. ISO 14289 (PDF/UA) is a family of standards addressing accessible PDF.

ISO 14289-1 (PDF/UA-1) is the ISO standard for accessible PDF files written against ISO 32000-1 (PDF 1.7), for PDF 1.x files.

ISO 14289-2 (PDF/UA-2) is the ISO standard for accessible PDF files written against ISO 32000-2 (PDF 2.0), for PDF 2.0 files.

Q: What is the relationship between WTPDF, PDF/UA and WCAG 2.x?

A: WCAG establishes generic accessibility norms for web technologies (including PDF) focused on end-user outcomes, whereas WTPDF and PDF/UA focus entirely on constructing PDF files for reusability and accessibility.

As such, WCAG, WTPDF and PDF/UA are entirely complementary:

  • WCAG provides requirements regarding content accessibility;
  • WTPDF and PDF/UA provide requirements that ensure accessibility in the PDF context.

The best practice for document authors producing PDF files is to:

  • consider WCAG’s requirements when designing and creating content;
  • use software capable of meeting WTPDF and PDF/UA requirements to produce the PDF files.

Questions about the format(s)

Q: Why would I use WTPDF and PDF/UA-2 instead of PDF/UA-1?

A: PDF 2.0 introduces new capabilities providing specific solutions for the following types of content:

  • Math
  • Fragments of documents
  • Headings which skip levels
  • More than 6 levels of headings
  • Sub-divisions of block elements (e.g., lines of code)
  • Documents including “side” content
  • Documents with both titles and headings
  • Lists separated in sections with other content between list items
  • Links targeting headings or other content
  • Content that uses emphasis (e.g., <strong>)
  • Page numbers, line numbers, Bates numbers
  • Redactions
  • Watermarks

WTPDF and/or PDF/UA-2 are required to take advantage of these PDF 2.0 capabilities in a consistent, reusable, and accessible manner.

In addition, WTPDF and PDF/UA-2 add comprehensive new rules for reuse and accessibility of existing PDF features, including layout attributes and annotations, that PDF/UA-1 did not fully address.

Q: I heard that PDF 2.0 removes a bunch of useful tags. Is that true?

A: It’s false! All the tags defined in PDF 1.7 are allowed in a PDF 2.0 file, including WTPDF and PDF/UA-2 files. In fact, the PDF 1.7 tag set is the default namespace for use in PDF 2.0 files. The PDF 2.0 tag set adds new structure element types; the two tag sets can be used together (see “What is ISO TS 32005?”).

Q: What are namespaces?

A: PDF 2.0 introduced the concept of namespaces for sets of PDF tags to enable rich interoperability.

PDF 2.0 defined two standard namespaces for Tagged PDF:

  • the “standard structure namespace for PDF 1.7” - informally known as the “PDF 1.7 tag set” or just the “PDF 1.7 tags” - which is the default in PDF 2.0;
  • the “standard structure namespace for PDF 2.0” - informally known as the “PDF 2.0 tag set” or just the “PDF 2.0 tags”.

PDF 2.0 also introduced the concept of domain-specific namespaces. It defined MathML as one domain-specific namespace to facilitate the inclusion of MathML tags in PDF 2.0 files.

PDF allows for custom namespaces to facilitate introduction of richer semantics. One use case already being explored using this capability is to accommodate richer semantics for STEM.

Compared to the PDF 1.7 tag set, the PDF 2.0 tag set includes new tags, excludes some existing tags, and redefines some tags. For example:

  • The DocumentFragment tag was added to the PDF 2.0 tag set; it does not exist in PDF 1.7;
  • The Part tag has expanded usage in the PDF 2.0 tag set;
  • BlockQuote is a tag in the PDF 1.7 tag set, but not in the PDF 2.0 tag set.
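To make the three examples above concrete, the following sketch records which standard namespace defines each of those tags. The namespace names are the ones ISO 32000-2 assigns to the two standard structure namespaces and to MathML; the `TAG_SETS` table is deliberately incomplete (it lists only the tags named in this answer), and `namespaces_defining` is a hypothetical helper.

```python
from typing import Dict, List, Set

# Namespace names as given in ISO 32000-2.
PDF17_NS = "http://iso.org/pdf/ssn"    # standard structure namespace for PDF 1.7
PDF20_NS = "http://iso.org/pdf2/ssn"   # standard structure namespace for PDF 2.0
MATHML_NS = "http://www.w3.org/1998/Math/MathML"  # domain-specific namespace

# Illustrative subset only: just the tags mentioned in the examples above.
TAG_SETS: Dict[str, Set[str]] = {
    PDF17_NS: {"Part", "BlockQuote"},
    PDF20_NS: {"Part", "DocumentFragment"},
}


def namespaces_defining(tag: str) -> List[str]:
    """Return the standard namespaces (from the subset above) that define `tag`."""
    return [ns for ns, tags in TAG_SETS.items() if tag in tags]


print(namespaces_defining("DocumentFragment"))  # PDF 2.0 tag set only
print(namespaces_defining("BlockQuote"))        # PDF 1.7 tag set only
print(namespaces_defining("Part"))              # defined in both tag sets
```

A tag such as Part that appears in both sets can carry different usage rules depending on which namespace it is used in, which is why a processor needs to know the namespace and not just the tag name.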

Q: How are accessibility of content, the PDF file version, and PDF tag set versions related?

A: Content in a PDF file will be accessible to the extent that standard tags are used appropriately, including role mapping non-standard tags to standard tags.
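Role mapping can be pictured as following a chain of mappings until a standard tag is reached. The sketch below assumes a plain dictionary as the role map and a tiny, illustrative subset of standard tags; `resolve_role`, `STANDARD_TAGS`, and the custom tag names are all hypothetical and chosen only for this example.

```python
from typing import Dict, Optional, Set

# Illustrative subset of standard tags; the real sets are defined by the PDF specification.
STANDARD_TAGS: Set[str] = {"P", "H1", "L", "LI", "Table", "Figure"}


def resolve_role(tag: str, role_map: Dict[str, str]) -> Optional[str]:
    """Follow role mappings from `tag` until a standard tag is found.

    Returns None when the chain ends at an unmapped non-standard tag
    or loops back on itself.
    """
    seen: Set[str] = set()
    while tag not in STANDARD_TAGS:
        if tag in seen or tag not in role_map:
            return None
        seen.add(tag)
        tag = role_map[tag]
    return tag


role_map = {
    "Chapter-Title": "Heading",  # custom tag mapped to another custom tag
    "Heading": "H1",             # ...which in turn maps to a standard tag
    "Sidebar": "Widget",         # "Widget" is never mapped further
}

print(resolve_role("Chapter-Title", role_map))  # H1
print(resolve_role("Sidebar", role_map))        # None
```

A tag that resolves to None gives consuming software no standard semantics to rely on, which is the situation the answer above warns against.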

PDF 1.4 initially introduced Tagged PDF, including a set of standard types of tags and attributes. PDF 1.5-1.7 introduced a few additions to that tag set and to Tagged PDF in general. As explained above, PDF 2.0 defines a new set of tags while maintaining the previously defined tags from PDF 1.7 as the default tag set. Thus, the PDF version identifies the tags that are listed as standard in the corresponding version of the PDF specification. PDF/UA-1 uses the PDF 1.7 tag set, even for PDF files with earlier versions.

In practice at the time of this writing, the PDF 1.7 tag set is the set of standard tags to use for all PDF 1.x files. For WTPDF and PDF/UA-2 files, the standard set of tags is a combination of the PDF 1.7 and PDF 2.0 tag sets, with their relationships as defined by ISO TS 32005.

Q: Can a Tagged PDF be accessible without conforming to PDF/UA or WTPDF?

A: Although WTPDF’s conformance level for accessibility and PDF/UA define the “gold standard” for accessible PDF, many documents can be made accessible by following the principles established in these specifications without explicitly identifying their conformance to WTPDF or PDF/UA.

Q: Can a PDF file be accessible without being Tagged?

A: No. Some software will attempt to make a PDF file accessible while ignoring tags, relying on its own interpretation of the page layout, fonts and other characteristics to determine the content’s semantic structures. Although these attempts may be partially or even wholly successful on simpler documents, they will always represent the software’s interpretation instead of the author’s intention.

Accessibility workflows

Q: How do I know if a document claims to conform to WTPDF or PDF/UA?

A: Both the WTPDF specification and PDF/UA standards define requirements for conforming PDF files to include conformance information in the file’s XMP metadata.

Many software applications provide access to the document XMP metadata via a file properties dialog. Some will also present conformance claims via other means (e.g., a bar above the page, or a sidebar).
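For readers who want to inspect a claim programmatically, here is a minimal sketch that reads the PDF/UA identifier (`pdfuaid:part`) from an XMP packet using Python's standard library. It assumes the XMP packet has already been extracted from the PDF file as a string, it does not look at WTPDF's PDF Declarations, and the sample packet is made up for illustration. `pdfua_part` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import Optional

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFUAID_NS = "http://www.aiim.org/pdfua/ns/id/"  # PDF/UA identification schema


def pdfua_part(xmp_packet: str) -> Optional[int]:
    """Return the claimed PDF/UA part (1 or 2), or None if no claim is present."""
    root = ET.fromstring(xmp_packet)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # XMP allows a simple property as an attribute or as a child element.
        value = desc.get(f"{{{PDFUAID_NS}}}part")
        if value is None:
            child = desc.find(f"{{{PDFUAID_NS}}}part")
            value = child.text if child is not None else None
        if value is not None:
            return int(value.strip())
    return None


# Made-up sample packet for illustration only.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfuaid="http://www.aiim.org/pdfua/ns/id/">
      <pdfuaid:part>2</pdfuaid:part>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

print(pdfua_part(SAMPLE_XMP))  # 2
```

A result of 2 here only means the file claims PDF/UA-2 conformance; it says nothing about whether the file actually conforms.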

Q: How do I know that a PDF file actually conforms to either WTPDF or PDF/UA?

A: Conformance claims in PDF files do not guarantee that a document truly conforms to the respective specification.

PDF accessibility validators exist, some of which provide a set of automated and manual checks to help verify conformance.

Before relying on validation software it is important to understand the difference between automated and manual checks. The Matterhorn Protocol establishes PDF industry guidance for both automated (so-called “machine” checks) and manual validation checks.

Not all validation software is capable of validating PDF/UA-1, WTPDF, and PDF/UA-2, nor will it necessarily be aware of its own limitations. Examine the software’s compatibility claims carefully to ensure you understand which specifications and standards it claims to support.

Q: Can I use PDF 1.7 tags in a PDF 2.0 file?

A: Yes - ISO TS 32005 specifies how tags defined in the PDF 1.7 tag set can be used in a PDF 2.0 file, along with tags defined in the PDF 2.0 tag set. Both WTPDF and PDF/UA-2 are based on ISO TS 32005.

Q: Do I need special software to open WTPDF or PDF/UA documents?

A: No. Any PDF 2.0 software can open a Tagged PDF file. However, to obtain the benefits of Tagged PDF you need software that leverages Tagged PDF.

To get the benefit of PDF/UA-1, users will need software that supports PDF 1.7 and PDF/UA-1. To get the benefit of Well-Tagged PDF or PDF/UA-2, users will need software that supports PDF 2.0 and PDF/UA-2.

Q: Can I convert a PDF/UA-1 file to WTPDF and/or PDF/UA-2?

A: Yes. If you have a file that includes content that would benefit from the features introduced in PDF 2.0, such as math (see “Why would I use WTPDF and PDF/UA-2 instead of PDF/UA-1?”), then it may be worth investing in creating a new PDF/UA-2 compliant file.

Q: If I use a WTPDF or PDF/UA-2 file with software that doesn’t support these specifications, what should I expect?

A: Both WTPDF and PDF/UA-2 are based on PDF 2.0, so to obtain the benefits of tagged PDF in these files your software must support ISO 32000-2 (PDF 2.0).

If a Well-Tagged PDF file or a PDF/UA-2 file is opened with software that doesn’t use these specifications then some of the accessibility and reuse features might not be available, and the understanding of the PDF 1.7 tag set could be different, resulting in a degraded experience and/or different behavior.

Q: Can I use WTPDF and PDF/UA with PDF/A?

A: Yes. For a PDF file to conform to PDF/A as well as WTPDF and/or PDF/UA-2, the file must use PDF/A-4 (the archival PDF format for PDF 2.0). PDF/UA-1 is compatible with PDF/A-1, PDF/A-2 or PDF/A-3 at conformance level A.

Q: Where can I learn more?

A: The PDF Association’s PDF/UA Technical Working Group has produced both the Matterhorn Protocol, a freely available list of checks matching the requirements of PDF/UA-1, and the Tagged PDF Best Practice Guide: Syntax, which provides extensive guidance on the use of tagged PDF.

Visit the PDF Association’s product showcase to see a list of members who have chosen to indicate their support for PDF/UA. You may also want to join the Understanding PDF/UA or Accessible PDF groups on LinkedIn. For more information and resources visit pdfa.org/accessibility.

Appendix

The following specifications or standards relate to accessibility using Tagged PDF:

Document | Publication date | Role regarding Tagged PDF
PDF 1.4 | 2001 | The first definition of Tagged PDF.
PDF 1.5-1.7 | 2003-2006 | Incremental additions to Tagged PDF.
ISO 32000-1 | 2008 | The ISO-standardized publication of PDF 1.7, the basis for PDF/UA-1.
ISO 14289-1 (PDF/UA-1) | 2014 | Defines requirements for universal accessibility of PDF 1.x files.
ISO 32000-2 (PDF 2.0) | 2017-2020 | A major rewrite of Tagged PDF, including several new features.
Matterhorn Protocol 1.1 | 2021 | A comprehensive list of all the possible ways to fail PDF/UA-1.
Deriving HTML from PDF | 2019 | A reproducible algorithm for producing predictable HTML output from Tagged PDF.
ISO TS 32005 | 2023 | Clarifies how to use 1.7 tags in PDF 2.0 files.
Well Tagged PDF (WTPDF) | 2024 | Defines Tagged PDF requirements for purposes of either technical reuse of content or universal accessibility of PDF 2.0 files.
ISO 14289-2 (PDF/UA-2) | 2024 | Defines Tagged PDF requirements for universal accessibility of PDF 2.0 files. Formal name: ISO 14289-2.