PDF 2.0 support gets top billing in coverage of LibreOffice 25.8

The PDF Association staff delivers a vendor-neutral platform in service of PDF’s stakeholders.


Contents
- Thank you Berlin!
- PDF 2.0 support gets top billing in coverage of LibreOffice 25.8
- Now Tagged PDF and PDF/UA-2 from ConTeXt
- Visual AI is not “parsing”
- ETSI publishes accessibility guidance for authors of standards
- “Scanning a PDF is not digitization” Huh?
- Saying “no” to AI text and data mining
- Australia and New Zealand jointly adopt more ISO standards for PDF
- PDFacademicBot for September 2025
Thank you Berlin!
PDF Days Europe 2025 is in the rearview mirror, but those who attended know what others missed! The great news for everyone is that the presentations were recorded. The recordings, along with the slide decks, will be uploaded to our Presentations area in the coming months.
Here are just a few of the scenes from this memorable event.
Thanks to the speakers, sponsors and staff! All photos thanks to Catrin Wolf.
PDF 2.0 support gets top billing in coverage of LibreOffice 25.8
Coverage of the latest LibreOffice focuses on support for features added in PDF 2.0.
- In a weekly summary on 9to5linux.com and in a dedicated article
- On OMGUbuntu
- On Linuxiac
- On The Register
- On It’s FOSS News
- On DesdeLinux
- On Neowin, David Ozondu notes that LibreOffice’s PDF 2.0 support is ahead of Microsoft Office’s, most notably when it comes to 256-bit AES encryption.
- Austrian Armed Forces goes open source (LibreOffice)
Now Tagged PDF and PDF/UA-2 from ConTeXt
A popular alternative to LaTeX, ConTeXt now supports Tagged PDF, including PDF/UA-2! This discussion also mentions ISO/TS 32005 (also included in our sponsored no-cost PDF/UA bundle), which shows a good understanding of best-practice PDF accessibility.
Coupled with the progress in LibreOffice and LaTeX (enabled by the PDF Association’s LaTeX Project LWG), this shows the open source communities taking the lead in bringing modern accessibility to documents. Well done!
Visual AI is not “parsing”
This article conflates document layout analysis (DLA) with parsing. Wikipedia’s definition of parsing is “the syntactic analysis of the input code into its component parts in order to facilitate the writing of compilers and interpreters” that “conform[s] to the rules of a formal grammar by breaking it into parts”. Using visual AI models for analysing a document's appearance is not “parsing”, as there is no grammar (formal or otherwise!).
Visual AI models definitely have their place, but are processing-intensive, can be costly to run, and, like any AI, can hallucinate. As the article describes, PDF pages must first be rendered (hopefully at sufficient resolution so small details are not lost, which might explain their comment “content of the small print description corresponding to the illustration is not recognized”!), before OCR is used. Still, for the use case of scanned documents, using visual AI model(s) can be helpful, as the article goes on to evaluate.
Applying only a visual AI model to modern “born digital” PDFs with fully extractable text, embedded fonts (where all glyph outlines exist as vectors), and Tagged PDF rich semantics makes far less sense. When visual AI is optimally combined with other technologies, including “proper” parsing approaches, the combined strengths can provide significant benefits: improved performance, fewer errors, and reduced costs.
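To make the distinction concrete, here is a minimal sketch of parsing in the formal-grammar sense. It handles only a deliberately simplified subset of PDF object syntax (names, integers, arrays, and dictionaries) and is not a real PDF parser. The function names `tokenize` and `parse_object` are illustrative assumptions, not taken from any library. Input that does not conform to the grammar is rejected rather than guessed at, which is exactly what a visual model looking at rendered pixels cannot do.

```python
import re

# Tokens of a simplified subset of PDF object syntax (an assumption made
# for illustration): dictionary/array delimiters, names, and integers.
TOKEN = re.compile(r"\s*(<<|>>|\[|\]|/[^\s/<>\[\]()]+|[+-]?\d+)")


def tokenize(text):
    """Split text into tokens; raise ValueError on input outside the grammar."""
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_object(tokens, i=0):
    """Parse one object starting at tokens[i]; return (value, next_index)."""
    if i >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[i]
    if tok == "<<":  # dictionary: << (name object)* >>
        result, i = {}, i + 1
        while i < len(tokens) and tokens[i] != ">>":
            key = tokens[i]
            if not key.startswith("/"):
                raise ValueError("dictionary keys must be names")
            value, i = parse_object(tokens, i + 1)
            result[key[1:]] = value
        if i >= len(tokens):
            raise ValueError("unterminated dictionary")
        return result, i + 1
    if tok == "[":  # array: [ object* ]
        items, i = [], i + 1
        while i < len(tokens) and tokens[i] != "]":
            value, i = parse_object(tokens, i)
            items.append(value)
        if i >= len(tokens):
            raise ValueError("unterminated array")
        return items, i + 1
    if tok in (">>", "]"):
        raise ValueError(f"unexpected closing delimiter {tok!r}")
    if tok.startswith("/"):  # name: returned without its leading slash
        return tok[1:], i + 1
    return int(tok), i + 1  # the only remaining token kind is an integer


if __name__ == "__main__":
    source = "<< /Type /Page /MediaBox [0 0 612 792] /Count 3 >>"
    value, _ = parse_object(tokenize(source))
    print(value)
```

Every decision in the sketch is driven by a grammar rule, so the result is either a structured value or an explicit error. Document layout analysis has no such rules to conform to.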
ETSI publishes accessibility guidance for authors of standards
As a key European standards development organization (SDO), the European Telecommunications Standards Institute (ETSI) publishes many important technical standards, with the most relevant to PDF being PAdES (PDF Advanced Electronic Signatures) as EN 319 142. Recently, ETSI published a report titled “ETSI EG 204 061 v1.0.0 (2025-8): Human Factors (HF); ETSI Accessibility Strategy; Accessibility of ETSI Deliverables and Improvement of the Development Process of Deliverables”, which covers the accessibility challenges facing the authors of technical standards.
For many years, the PDF Association has promoted the need to make standards smarter and more accessible. As the ISO Committee Manager for ISO TC 171 SC 2, we have sought numerous derogations necessary for ISO to publish documents that are Tagged PDF or comply with PDF/UA (itself an ISO standard!). We also continue to actively work with Metanorma so that all smart standards authored with Metanorma can automatically comply with both PDF/A and PDF/UA standards. Although we welcome this report from ETSI, there is still a long way to go to ensure that technical publications from all SDOs are accessible!
“Scanning a PDF is not digitization” Huh?
Apparently, Germany’s Freie Demokratische Partei (FDP), as part of their quest to comprehensively digitize Germany, feels the need to point out that “PDF scanning isn’t digitization”... or so the poster reads.
What they probably meant to say: “Just scanning paper to PDF is not full digitization.”
We might also note that they post their argument as a PDF/X-4 file, so clearly they're not anti-PDF! However, this PDF, while claiming conformance with PDF/X-4 (to support high-end printing), is not even a Tagged PDF, let alone PDF/UA, so their “Argumente zum Download” isn’t going to go very far with a large population of Germans!
Saying “no” to AI text and data mining
PDF Association member Tracker Software is the first PDF viewer vendor (to our knowledge) to add explicit support for W3C’s TDMRep to their XMP interface (in version 10.7.0.398, as per https://help.pdf-xchange.com/pdfxe10/).
For those wanting to add TDMRep to PDF/A-1, PDF/A-2, and PDF/A-3 documents, we also provide a free XMP Extension Schema.
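As a rough illustration of what a TDMRep opt-out could look like as XMP metadata, the sketch below builds and reads back a small RDF fragment. The namespace URI `http://www.w3.org/ns/tdmrep#` and the property names `tdm:reservation` and `tdm:policy` are assumptions based on our reading of the W3C TDMRep report and should be checked against it. The helper names `build_tdm_xmp` and `read_tdm_reservation` are hypothetical. The sketch does not embed the metadata in a PDF and does not cover the PDF/A extension schema.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: namespace URI as we understand it from the TDMRep report;
# verify against the report before relying on it.
TDM_NS = "http://www.w3.org/ns/tdmrep#"

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("tdm", TDM_NS)


def build_tdm_xmp(reserved, policy_url=None):
    """Return an RDF fragment declaring whether TDM rights are reserved.

    The property names 'reservation' and 'policy' are assumed, not verified.
    """
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        rdf, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": ""}
    )
    # "1" means rights are reserved (opt-out), "0" means they are not.
    ET.SubElement(desc, f"{{{TDM_NS}}}reservation").text = "1" if reserved else "0"
    if policy_url:
        ET.SubElement(desc, f"{{{TDM_NS}}}policy").text = policy_url
    return ET.tostring(rdf, encoding="unicode")


def read_tdm_reservation(xmp_rdf):
    """Return True/False for an explicit reservation, or None if absent."""
    root = ET.fromstring(xmp_rdf)
    node = root.find(f".//{{{TDM_NS}}}reservation")
    return None if node is None else node.text == "1"


if __name__ == "__main__":
    # The policy URL is a placeholder, not a real policy location.
    fragment = build_tdm_xmp(True, "https://example.com/tdm-policy.json")
    print(fragment)
    print(read_tdm_reservation(fragment))
```

A real workflow would place such a fragment inside the document's XMP packet using a PDF library, and for PDF/A-1, PDF/A-2, and PDF/A-3 would also include the extension schema description mentioned above.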
Australia and New Zealand jointly adopt more ISO standards for PDF
Australia and New Zealand have completed the process of jointly adopting more ISO standards for PDF as national standards, including:
- AS/NZS ISO 14289.2:2025 - equivalent to ISO 14289-2:2024 (PDF/UA-2)
- SA/SNZ TS ISO 32002:2025 - equivalent to ISO/TS 32002:2023 (Extensions to digital signatures in PDF 2.0)
- SA/SNZ TS ISO 32003:2025 - equivalent to ISO/TS 32003:2023 (AES-GCM support in PDF 2.0)
- SA/SNZ TS ISO 32005:2025 - equivalent to ISO/TS 32005:2023 (PDF 1.7 and 2.0 structure namespace inclusion in PDF 2.0)
- SA/SNZ TS ISO 24064:2025 - equivalent to ISO/TS 24064:2023 (STEP AP 242 support in PDF 2.0)
Along with previous adoptions, these new unmodified direct text adoptions enable more PDF standards to be directly referenced by Australian and New Zealand legislation and regulations.
PDFacademicBot for September 2025
Bagley, S. et al. (Sept. 2025) Proceedings of the 2025 ACM Symposium on Document Engineering. Nottingham, United Kingdom: ACM. https://dl.acm.org/doi/proceedings/10.1145/3704268
Bhatele, P. and Bedekar, M. (2025) “Smartphone Sensor Dataset for Online Reading Analysis,” in Modern Practices and Trends in Expert Applications and Security: Proceedings of MP-TEAS 2024, Volume 1. USA: Springer Nature, pp. 1–10.
Bienzeisler, J. et al. (Sept. 2025) “Embedding FHIR in Medical PDF: A Migration Path for Interoperable Documentation,” Studies in Health Technology and Informatics, 331, pp. 186–194. https://doi.org/10.3233/SHTI251395.
Dhole, R. et al. (July 2025) “Online PDF to Text and Audio Converter and Language Translator Using Python,” International Journal of Innovative Science and Research Technology, pp. 17–24. https://doi.org/10.38124/ijisrt/25jun156.
Holubnyk, T., Dubnevych, M. and Boyarchuk, A. (July 2025) “Digital method of analyzing color parameters during PDF document verification,” CyberPhyS’25: 2nd International Workshop on Intelligent & CyberPhysical Systems, p. 11. https://ceur-ws.org/Vol-4013/paper7.pdf
Karonen, K. (August 2025) PDF advanced electronic signature with smart cards. Master’s Programme in Computer, Communication and Information Sciences. Aalto University, Finland. https://aaltodoc.aalto.fi/server/api/core/bitstreams/c81b748f-c7bc-4dc6-80a9-eb6d3c15c931/content.
Klein, S.T. and Shapira, D. (2025) “Electronic Alternatives to Micro-Fiches,” in Proceedings of the Prague Stringology Conference 2025. Prague: Czech Technical University in Prague, Czech Republic, pp. 13–25. https://www.stringology.org/cgi-bin/getfile.cgi?t=pdf&c=-&y=2025&n=03.
Kulkarni, S. et al. (2025) “Artificial Intelligence Based PDF and Document Extractor Using Retrieval Augmented Generation,” in 1st International Conference on Lifespan Innovation (ICLI 2025). Atlantis Press, pp. 428–435. https://doi.org/10.2991/978-94-6463-831-8_52.
Kumar, A., Padath, T. and Wang, L.L. (2025) “Benchmarking PDF Accessibility Evaluation: A Dataset and Framework for Assessing Automated and LLM-Based Approaches for Accessibility Testing.” arXiv. https://doi.org/10.48550/arXiv.2509.18965.
Mittelbach, F. et al. (Sept. 2025) “Well-Tagged PDF and Universal Accessibility with LaTeX,” tutorial in Proceedings of the 2025 ACM Symposium on Document Engineering. New York, NY, USA: Association for Computing Machinery (DocEng ’25), p. 1. https://doi.org/10.1145/3704268.3749107.
Mittelbach, F. et al. (Sept. 2025) “MathML and other XML Technologies for Accessible PDF from LaTeX,” short paper in Proceedings of the 2025 ACM Symposium on Document Engineering. New York, NY, USA: Association for Computing Machinery (DocEng ’25), pp. 1–4. https://doi.org/10.1145/3704268.3748669.
Moreira‐Filho, J.T. et al. (Sept. 2025) “Automating Data Extraction From Scientific Literature and General PDF Files Using Large Language Models and KNIME: An Application in Toxicology,” WIREs Computational Molecular Science, 15(5), p. e70047. https://doi.org/10.1002/wcms.70047.
Nicholas, C. (Sept. 2025) “Issues in Document Security,” keynote in Proceedings of the 2025 ACM Symposium on Document Engineering. New York, NY, USA: Association for Computing Machinery (DocEng ’25), p. 1. https://doi.org/10.1145/3704268.3749109.
Obadage, R.R. et al. (2025) “Toward Robust URL Extraction for Open Science: A Study of arXiv File Formats and Temporal Trends.” arXiv. https://doi.org/10.48550/arXiv.2509.04759.
Robertsson, W. (August 2025) Development of an On-Premises PDF Report Generation Tool Using Java: Design and implementation for document generation. Bachelor Thesis, Computer Engineering. KTH Royal Institute of Technology, School of Electrical Engineering and Computer Science. https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-367993
Roseborough, A. et al. (2025) “An International Randomized Study of Video Versus Illustrated-PDF Neuroanatomy and Radiology Teaching for Radiation Oncology Residents,” Radiotherapy and Oncology, 210, p. S36. https://doi.org/10.1016/S0167-8140(25)04743-7.
Salvador, A.D., Tonnang, H.E.Z. and Odindi, J. (August 2025) “A review on knowledge and information extraction from PDF documents and storage approaches,” Frontiers in Artificial Intelligence, 8. https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2025.1466092/abstract.
Savaş, A. and Konyar, M. (June 2025) “A New Reversible Data Hiding Method in PDF Files for Secret Communication,” in 2025 IEEE International Black Sea Conference on Communications and Networking (BlackSeaCom). Chisinau, Moldova: IEEE, p. 6. https://avesis.kocaeli.edu.tr/yayin/7864e523-91d6-4676-8e1f-b0f39214c934/a-new-reversible-data-hiding-method-in-pdf-files-for-secret-communication
Takatsume, Y., Shibao, S. and Horiguchi, T. (Sept. 2025) “Stereotactic Step-by-step Dissection of the Anterior Transpetrosal Approach: A Three-dimensional Photogrammetry-based Educational Tool for Neurosurgeons,” Neurologia medico-chirurgica, pp. 2025–0110. https://doi.org/10.2176/jns-nmc.2025-0110.
Teuscher, I.H. and Schooley, B. (2025) “Document Encryption in Practice: A Comparative Framework and Evaluation,” in Proceedings of the 2025 ACM Symposium on Document Engineering. New York, NY, USA: Association for Computing Machinery (DocEng ’25), pp. 1–4. https://doi.org/10.1145/3704268.3748682.
Wehnert, S., Changaramkulath, H. and Luca, E.W.D. (August 2025) “HiPS: Hierarchical PDF Segmentation of Textbooks.” arXiv. https://doi.org/10.48550/arXiv.2509.00909.
Xiong, J. et al. (August 2025) “The Implications of Insecure Use of Fonts against PDF Documents and Web Pages,” IEEE Transactions on Information Forensics and Security, pp. 1–1. https://doi.org/10.1109/TIFS.2025.3599320.