Bridging PDF and Web Accessibility
This update from the PDF/UA Processor LWG includes an example of its work mapping PDF’s structure types to web accessibility APIs.
Background
The PDF Association’s PDF/UA Processor Liaison Working Group started work in July 2022 based on a request by ISO’s TC 171 SC 2 Working Group 9 – the same people who brought you the PDF/UA specification (ISO 14289).
Today, users who require assistive technology (AT) to read PDFs do not enjoy a similar or consistent experience across diverse hardware and software platforms. The Processor LWG was created with two main objectives in mind. The first is to help developers who are more familiar with web technology readily understand and use PDF’s accessibility features. The second is to encourage vendors to move towards a standardized solution instead of relying on implementation-specific approaches.
Our approach thus far
Because commonly used web technologies have evolved, and now deliver generally consistent results for web pages, the Processor LWG is “borrowing those wheels,” as opposed to inventing new ones. For the past year we have examined the accessibility API role mappings for HTML elements and WAI-ARIA (and DPub) attributes in order to map these features to their functional equivalents in PDF (tags, attributes, properties, etc.).
An example of the work
This table (an Excel spreadsheet) represents a subset of the committee’s work to date. It includes not only the familiar tags defined by PDF 1.7 (ISO 32000-1:2008) but also others defined in PDF 2.0 (ISO 32000-2:2020). The table lists various PDF tags (and, in some cases, attributes), identifies their analogous HTML elements and/or WAI-ARIA (and DPub) roles, and indicates their respective mappings to various industry-standard accessibility APIs, including MSAA and IAccessible2, UIA, ATK/AT-SPI, and AX API.
In some instances, the table provides two (or even three) different ways in which a PDF tag may be handled, depending on its attributes. One such example is the L (list) tag with different ListNumbering attributes.
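To make the idea of an attribute-dependent mapping concrete, here is a minimal Python sketch of how a processor might look up a web equivalent for a PDF tag. It is an illustration only: the names `WebEquivalent`, `SIMPLE_MAPPINGS`, `ORDERED_NUMBERING`, and `web_equivalent` are hypothetical, and the specific pairings (including treating every non-ordered `ListNumbering` value as an unordered list) are assumptions made for this example, not values copied from the LWG’s table.

```python
from typing import NamedTuple, Optional


class WebEquivalent(NamedTuple):
    """An HTML element and the ARIA role it is usually exposed with."""
    html: str
    aria_role: str


# Illustrative pairings only (an assumption of this sketch, not the LWG's table).
SIMPLE_MAPPINGS = {
    "P": WebEquivalent("p", "paragraph"),
    "H1": WebEquivalent("h1", "heading"),
    "Table": WebEquivalent("table", "table"),
    "Link": WebEquivalent("a", "link"),
}

# ListNumbering values that imply a sequence. "Ordered" was added in PDF 2.0.
ORDERED_NUMBERING = {
    "Decimal", "UpperRoman", "LowerRoman",
    "UpperAlpha", "LowerAlpha", "Ordered",
}


def web_equivalent(tag: str,
                   list_numbering: Optional[str] = None) -> Optional[WebEquivalent]:
    """Return a web equivalent for a PDF tag, or None if this sketch has no entry.

    The L tag is the attribute-dependent case: its ListNumbering value
    decides between an ordered and an unordered HTML list.
    """
    if tag == "L":
        if list_numbering in ORDERED_NUMBERING:
            return WebEquivalent("ol", "list")
        # Assumption: anything else (Disc, Circle, None, missing) is unordered.
        return WebEquivalent("ul", "list")
    return SIMPLE_MAPPINGS.get(tag)
```

Calling `web_equivalent("L", "Decimal")` selects the ordered-list branch, while `web_equivalent("L", "Disc")` or a bare `web_equivalent("L")` falls back to the unordered one. A real processor would go one step further and resolve each result to the platform-specific roles listed in the table.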
Next steps
Our immediate next step is to continue mapping PDF’s semantic markup constructs to their respective HTML, WAI-ARIA, and DPub roles. We’ll also need to determine if processors and assistive technologies handle those things consistently, and as the author intended, across platforms. This could be a great opportunity for developers to really let their creative juices flow!
We’ll also need to answer questions such as, “When a tag includes both Alt and ActualText, what should AT do?” That’s just one of many questions that won’t be solved simply by using the accessibility API role mapping tables. Clearly, we have more work to do!
Join us!
We invite interested developers and stakeholders to join us in our efforts! Meetings occur on the 4th Tuesday of each month at 4 pm (Eastern time, US).
Joining is easy. PDF Association members can simply log in to pdfa.org, navigate to the Member Area, and sign up for the PDF/UA Processor LWG.
Non-members with expertise in the subject may also attend LWG meetings. To request access, send an email to info@pdfa.org with a few words about your interest in PDF/UA processor requirements.
We look forward to continuing our work and to you joining us in our efforts!