PDF 2.0 support gets top billing in coverage of LibreOffice 25.8

September 29, 2025
Thank you Berlin | PDF 2.0 support gets top billing in coverage of LibreOffice 25.8 | Now Tagged PDF and PDF/UA-2 from ConTeXt | Visual AI is not “parsing” | ETSI publishes accessibility guidance for authors of standards | “Scanning a PDF is not digitization” Huh? | Saying “no” to AI text and data mining | PDFacademicBot for September 2025
PDF Association staff



Thank you Berlin!

PDF Days Europe 2025 is in the rearview mirror, but those who attended know what others missed! The great news for everyone is that the presentations were recorded. The recordings and slide decks will be uploaded to our Presentations area in the coming months.

Here are just a few of the scenes from this memorable event.

Welcome to PDF Days!

Audience for the Monday keynote.

Crowd scene at PDF Days.

A scene from an educational session.

Wide-angle shot of a poster session.

Wide-angle shot of a coffee-break.

Attendees checking out the Milestones in PDF history poster.

Poster of Milestones in the History of PDF, with post-it notes from attendees.

Attendees enjoying a coffee break.

Thanks to the speakers, sponsors and staff! All photos thanks to Catrin Wolf.

PDF 2.0 support gets top billing in coverage of LibreOffice 25.8

Coverage of the latest LibreOffice release focuses on support for features added in PDF 2.0.

Now Tagged PDF and PDF/UA-2 from ConTeXt

A popular alternative to LaTeX, ConTeXt now supports Tagged PDF, including PDF/UA-2! This discussion also mentions ISO/TS 32005 (also included in our sponsored no-cost PDF/UA bundle), which shows a good understanding of best practice PDF accessibility.

Along with LibreOffice and LaTeX (enabled by the PDF Association’s LaTeX Project LWG), ConTeXt shows the open source communities taking the lead in bringing modern accessibility to documents. Well done!

Visual AI is not “parsing”

This article conflates document layout analysis (DLA) with parsing. Wikipedia’s definition of parsing is “the syntactic analysis of the input code into its component parts in order to facilitate the writing of compilers and interpreters” that “conform[s] to the rules of a formal grammar by breaking it into parts”. Using visual AI models for analysing a document's appearance is not “parsing”, as there is no grammar (formal or otherwise!).
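To make the distinction concrete, here is a minimal Python sketch of grammar-driven parsing. It assumes a toy subset of PDF content stream syntax (no escaped or nested parentheses, hex strings, arrays, or inline images), and the names TOKEN, tokenize, and extract_text are illustrative rather than taken from any PDF library. A real parser has to implement the full syntax defined in ISO 32000.

```python
import re

# Token rules for a toy subset of PDF content stream syntax (assumption:
# no escaped or nested parentheses, hex strings, arrays, or inline images).
TOKEN = re.compile(r"""
    (?P<string>\([^()\\]*\))                  # literal string operand
  | (?P<name>/[^\s()/\[\]<>]+)                # name object such as /F1
  | (?P<number>[+-]?(?:\d+\.?\d*|\.\d+))      # integer or real number
  | (?P<operator>[A-Za-z*]+)                  # operator keyword such as BT, Tf, Tj
  | (?P<space>\s+)                            # whitespace between tokens
""", re.VERBOSE)


def tokenize(stream):
    """Yield (kind, text) pairs, rejecting input that matches no token rule."""
    pos = 0
    while pos < len(stream):
        match = TOKEN.match(stream, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        pos = match.end()
        if match.lastgroup != "space":
            yield match.lastgroup, match.group()


def extract_text(stream):
    """Collect the string operands of Tj operators found inside BT ... ET."""
    operands, in_text, shown = [], False, []
    for kind, text in tokenize(stream):
        if kind != "operator":
            operands.append(text)   # operands precede their operator
            continue
        if text == "BT":
            in_text = True
        elif text == "ET":
            in_text = False
        elif text == "Tj":
            ok = in_text and len(operands) == 1 and operands[0].startswith("(")
            if not ok:
                raise SyntaxError("Tj needs one string operand inside BT ... ET")
            shown.append(operands[0][1:-1])
        operands = []               # every operator consumes its operands
    return shown


print(extract_text("BT /F1 12 Tf 72 720 Td (Hello) Tj (PDF) Tj ET"))
# ['Hello', 'PDF']
```

The point is that every step is driven by syntax rules, and malformed input is rejected rather than guessed at. Note that the returned strings are raw operands: mapping them to Unicode still depends on the font's encoding or ToUnicode CMap.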

Visual AI models definitely have their place, but are processing-intensive, can be costly to run, and, like any AI, can hallucinate. As the article describes, PDF pages must first be rendered (hopefully at sufficient resolution so small details are not lost, which might explain their comment “content of the small print description corresponding to the illustration is not recognized”!), before OCR is used. Still, for the use case of scanned documents, using visual AI model(s) can be helpful, as the article goes on to evaluate.

Applying only a visual AI model to modern “born digital” PDFs with fully extractable text, embedded fonts (where all glyph outlines exist as vectors), and Tagged PDF rich semantics makes far less sense. When optimally combined with other technologies, including “proper” parsing approaches, the combined strengths can provide significant benefits with improved performance, fewer errors, and reduced costs.
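One possible way to combine the approaches is a simple routing step that prefers the cheapest reliable source of information and falls back to rendering plus visual AI only when nothing better exists. The sketch below is purely illustrative: PageInfo, its fields, and the pipeline names are hypothetical, and it assumes an upstream PDF parser has already determined these facts.

```python
from dataclasses import dataclass


@dataclass
class PageInfo:
    """Hypothetical summary produced by an upstream PDF parser."""
    has_extractable_text: bool   # text operators with usable Unicode mappings
    is_tagged: bool              # document carries Tagged PDF structure


def choose_pipeline(page: PageInfo) -> str:
    """Pick the least costly approach that should still be reliable."""
    if page.is_tagged and page.has_extractable_text:
        return "structure-tree"         # use Tagged PDF semantics directly
    if page.has_extractable_text:
        return "text-parsing"           # parse content streams, infer layout
    return "render-ocr-visual-ai"       # scanned page: render, OCR, visual model


print(choose_pipeline(PageInfo(has_extractable_text=False, is_tagged=False)))
# render-ocr-visual-ai
```

A production system would likely blend results rather than pick a single path, for example by using a visual model only to cross-check reading order on untagged pages.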

ETSI publishes accessibility guidance for authors of standards

As a key European standards development organization (SDO), the European Telecommunications Standards Institute (ETSI) publishes many important technical standards, with the most relevant to PDF being PAdES (PDF Advanced Electronic Signatures) as EN 319 142. Recently, ETSI published a report titled “ETSI EG 204 061 v1.0.0 (2025-8): Human Factors (HF); ETSI Accessibility Strategy; Accessibility of ETSI Deliverables and Improvement of the Development Process of Deliverables”, which covers accessibility challenges facing the authors of technical standards.

For many years, the PDF Association has promoted the need to make standards smarter and more accessible. As the ISO Committee Manager for ISO TC 171 SC 2, we have sought numerous derogations necessary for ISO to publish documents that are Tagged PDF or comply with PDF/UA (itself an ISO standard!). We also continue to actively work with Metanorma so that all smart standards authored with Metanorma can automatically comply with both PDF/A and PDF/UA standards. Although we welcome this report from ETSI, there is still a long way to go to ensure that technical publications from all SDOs are accessible!

“Scanning a PDF is not digitization” Huh?

Photo of a billboard including the text “PDF einscannen ist keine Digitalisierung. Freie Demokraten FDP.”

Apparently, Germany’s Freie Demokratische Partei (FDP), as part of their quest to comprehensively digitize Germany, feels the need to point out that “PDF scanning isn’t digitization”... or so the poster reads.

What they probably meant to say: “Just scanning paper to PDF is not full digitization.”

We might also note that they post their argument as a PDF/X-4 file, so clearly they're not anti-PDF! However, this PDF, while claiming conformance with PDF/X-4 (to support high-end printing), is not even a Tagged PDF, let alone PDF/UA, so their “Argumente zum Download” isn’t going to go very far with a large population of Germans!

Saying “no” to AI text and data mining

PDF Association member Tracker Software is the first PDF viewer vendor (to our knowledge) to add explicit support for W3C’s TDMRep to their XMP interface (in version 10.7.0.398, as per https://help.pdf-xchange.com/pdfxe10/).

Screenshot of the Tracker interface showing options for TDMRep metadata.

For those wanting to add TDMRep to PDF/A-1, PDF/A-2, and PDF/A-3 documents, we also provide a free XMP Extension Schema.
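As a rough illustration of what TDMRep metadata in XMP can look like, the Python sketch below builds the RDF fragment for a rights reservation. It assumes the namespace URI http://www.w3.org/ns/tdmrep/ and the property names reservation and policy as we understand them from the W3C TDMRep report; check both against the report and the extension schema before relying on them. The function name tdmrep_description is hypothetical, and a complete XMP packet also needs the x:xmpmeta wrapper and xpacket processing instructions, which are omitted here.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: namespace URI as understood from the W3C TDMRep report.
TDM_NS = "http://www.w3.org/ns/tdmrep/"


def tdmrep_description(reserved, policy_url=None):
    """Return an rdf:RDF fragment carrying TDMRep properties (illustrative)."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("tdm", TDM_NS)
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": ""}
    )
    # 1 = text and data mining rights are reserved, 0 = not reserved.
    ET.SubElement(desc, f"{{{TDM_NS}}}reservation").text = "1" if reserved else "0"
    if policy_url:
        # Optional pointer to a machine-readable policy document.
        ET.SubElement(desc, f"{{{TDM_NS}}}policy").text = policy_url
    return ET.tostring(rdf, encoding="unicode")


# The policy URL is a placeholder, not a real endpoint.
print(tdmrep_description(True, "https://example.com/tdm-policy.json"))
```

Writing the fragment into a PDF's metadata stream is left to whichever PDF library or tool is in use.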

Australia and New Zealand jointly adopt more ISO standards for PDF

Australia and New Zealand have completed the process to jointly adopt as national standards more ISO standards about PDF, including:

Along with previous adoptions, these new unmodified direct text adoptions enable more PDF standards to be directly referenced by Australian and New Zealand legislation and regulations.

PDFacademicBot for September 2025

Steven Bagley et al. (Sept. 2025) Proceedings of the 2025 ACM Symposium on Document Engineering. Nottingham, United Kingdom: ACM. https://dl.acm.org/doi/proceedings/10.1145/3704268

Priyanka Bhatele and Mangesh Bedekar (2025) “Smartphone Sensor Dataset for Online Reading Analysis,” in Modern Practices and Trends in Expert Applications and Security: Proceedings of MP-TEAS 2024, Volume 1. USA: Springer Nature, pp. 1–10. Google Books

Bienzeisler, J. et al. (Sept. 2025) “Embedding FHIR in Medical PDF: A Migration Path for Interoperable Documentation,” Studies in Health Technology and Informatics, 331, pp. 186–194. PubMed. https://doi.org/10.3233/SHTI251395.

Dhole, R. et al. (July 2025) “Online PDF to Text and Audio Converter and Language Translator Using Python,” International Journal of Innovative Science and Research Technology, pp. 17–24. https://doi.org/10.38124/ijisrt/25jun156.

Holubnyk, T., Dubnevych, M. and Boyarchuk, A. (July 2025) “Digital method of analyzing color parameters during PDF document verification,” CyberPhyS’25: 2nd International Workshop on Intelligent & CyberPhysical Systems, p. 11. https://ceur-ws.org/Vol-4013/paper7.pdf

Karonen, K. (August 2025) PDF advanced electronic signature with smart cards. Master’s Programme in Computer, Communication and Information Sciences. Aalto University, Finland. https://aaltodoc.aalto.fi/server/api/core/bitstreams/c81b748f-c7bc-4dc6-80a9-eb6d3c15c931/content.

Klein, S.T. and Shapira, D. (2025) “Electronic Alternatives to Micro-Fiches,” in Proceedings of the Prague Stringology Conference 2025. Prague: Czech Technical University in Prague, Czech Republic, pp. 13–25. https://www.stringology.org/cgi-bin/getfile.cgi?t=pdf&c=-&y=2025&n=03.

Kulkarni, S. et al. (2025) “Artificial Intelligence Based PDF and Document Extractor Using Retrieval Augmented Generation,” in 1st International Conference on Lifespan Innovation (ICLI 2025), Atlantis Press, pp. 428–435. https://doi.org/10.2991/978-94-6463-831-8_52.

Kumar, A., Padath, T. and Wang, L.L. (2025) “Benchmarking PDF Accessibility Evaluation: A Dataset and Framework for Assessing Automated and LLM-Based Approaches for Accessibility Testing.” arXiv. https://doi.org/10.48550/arXiv.2509.18965.

Mittelbach, F. et al. (Sept. 2025) “Well-Tagged PDF and Universal Accessibility with LaTeX,” tutorial in Proceedings of the 2025 ACM Symposium on Document Engineering. New York, NY, USA: Association for Computing Machinery (DocEng ’25), p. 1. https://doi.org/10.1145/3704268.3749107.

Mittelbach, F. et al. (Sept. 2025) “MathML and other XML Technologies for Accessible PDF from LaTeX,” short paper in Proceedings of the 2025 ACM Symposium on Document Engineering. New York, NY, USA: Association for Computing Machinery (DocEng ’25), pp. 1–4. https://doi.org/10.1145/3704268.3748669.

Moreira‐Filho, J.T. et al. (Sept. 2025) “Automating Data Extraction From Scientific Literature and General PDF Files Using Large Language Models and KNIME: An Application in Toxicology,” WIREs Computational Molecular Science, 15(5), p. e70047. https://doi.org/10.1002/wcms.70047.

Nicholas, C. (Sept. 2025) “Issues in Document Security,” keynote in Proceedings of the 2025 ACM Symposium on Document Engineering. New York, NY, USA: Association for Computing Machinery (DocEng ’25), p. 1. https://doi.org/10.1145/3704268.3749109.

Obadage, R.R. et al. (2025) “Toward Robust URL Extraction for Open Science: A Study of arXiv File Formats and Temporal Trends.” arXiv. https://doi.org/10.48550/arXiv.2509.04759.

Robertsson, W. (August 2025) Development of an On-Premises PDF Report Generation Tool Using Java: Design and implementation for document generation. Bachelor Thesis, Computer Engineering. KTH Royal Institute of Technology, School of Electrical Engineering and Computer Science. https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-367993

Roseborough, A. et al. (2025) “AN INTERNATIONAL RANDOMIZED STUDY OF VIDEO VERSUS ILLUSTRATED-PDF NEUROANATOMY AND RADIOLOGY TEACHING FOR RADIATION ONCOLOGY RESIDENTS,” Radiotherapy and Oncology, 210, p. S36. https://doi.org/10.1016/S0167-8140(25)04743-7.

Salvador, A.D., Tonnang, H.E.Z. and Odindi, J. (August 2025) ‘A review on knowledge and information extraction from PDF documents and storage approaches’, Frontiers in Artificial Intelligence, 8. https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2025.1466092/abstract.

Savaş, A. and Konyar, M. (June 2025) ‘A New Reversible Data Hiding Method in PDF Files for Secret Communication’, in 2025 IEEE International Black Sea Conference on Communications and Networking (BlackSeaCom), Chisinau, Moldova: IEEE, p. 6. https://avesis.kocaeli.edu.tr/yayin/7864e523-91d6-4676-8e1f-b0f39214c934/a-new-reversible-data-hiding-method-in-pdf-files-for-secret-communication

Takatsume, Y., Shibao, S. and Horiguchi, T. (Sept. 2025) “Stereotactic Step-by-step Dissection of the Anterior Transpetrosal Approach: A Three-dimensional Photogrammetry-based Educational Tool for Neurosurgeons,” Neurologia medico-chirurgica, pp. 2025–0110. https://doi.org/10.2176/jns-nmc.2025-0110.

Teuscher, I.H. and Schooley, B. (2025) “Document Encryption in Practice: A Comparative Framework and Evaluation,” in Proceedings of the 2025 ACM Symposium on Document Engineering. New York, NY, USA: Association for Computing Machinery (DocEng ’25), pp. 1–4. https://doi.org/10.1145/3704268.3748682.

Wehnert, S., Changaramkulath, H. and Luca, E.W.D. (August 2025) “HiPS: Hierarchical PDF Segmentation of Textbooks.” arXiv. https://doi.org/10.48550/arXiv.2509.00909.

Xiong, J. et al. (August 2025) ‘The Implications of Insecure Use of Fonts against PDF Documents and Web Pages’, IEEE Transactions on Information Forensics and Security, pp. 1–1. https://doi.org/10.1109/TIFS.2025.3599320.

