Developers must validate!

Devs must VALIDATE before applying ISO-standardized subset metadata to PDFs! | PDF Week Prague accommodations | OPF patches JHOVE | ISO TC 130 WG 2 focuses on print-related standards | The Ghent Workgroup offers seminars | The PDFacademicBot for August 2024!
About the author: The PDF Association staff delivers a vendor-neutral platform for PDF’s stakeholders, facilitating the development of open specifications and ISO standards for PDF technology. The staff are located in Germany, the …

PDF Week Prague accommodation options

It’s time to get ready for PDF Week in Prague! During the week of November 4-8, 2024, the PDF Association will be hosting face-to-face technical working meetings alongside ISO TC 171 SC 2 working group meetings as per the published agenda. All meetings will be at the Faculty of Mathematics and Physics at Charles University in beautiful Prague. We’ve posted a list of hotels within a 5-10 minute walk of this venue.

On Tuesday evening the group (many of whom spend PDF Week working on standards for PDF accessibility) will take in Prague’s Invisible Exhibition and enjoy an Invisible Dinner. On Friday afternoon, following the close of formal business, the PDF Association will provide an educational / discussion forum about PDF technology for the students and faculty of Charles University, our host for this meeting.

If you wish to attend, in person or virtually, be sure to register.

Developers must validate!

Why do PDF subsets such as PDF/A (for archival-quality PDF) and PDF/UA (for accessible PDF) exist?

The answer is straightforward: to make it possible for downstream users and software to leverage the subset for some purpose - often to comply with legal requirements or some other formal guidance.

Technically, indicating a PDF’s conformance to a standard such as PDF/A (ISO 19005), PDF/UA (ISO 14289) or PDF/X (ISO 15930) is easy (it’s “just” metadata). Precisely because it is so easy, it is vitally important that developers never label a file as conforming unless it has actually been validated as conforming.

Beyond undermining the utility and trustworthiness of such standards, mislabeled files can expose customers - and their vendors - to expensive legal action in regulated environments.

The PDF Association works alongside impacted stakeholder communities to help resolve large-scale, problematic mislabeling of PDFs. Some examples include:

  • Reporting issues with PDF/A conformance and dual PDF/A plus PDF/UA-1 conformance in OpenOffice and LibreOffice.
  • Reporting issues to Docusign where Tagged PDFs uploaded for signing were incorrectly flagged as PDF/UA-1 compliant. The rollout of fixes started on August 15 and will be completed for all customers by mid-September.

With validation technologies widely available for all PDF standards, there is no reason developers cannot detect such issues during development, well before production software starts creating incorrectly identified PDFs that persist and cause downstream issues.

If in doubt, leave it out!
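
As a rough illustration of that rule, the following Python sketch gates the conformance claim on a validation result. Everything named here is an assumption made for illustration: `ValidationReport`, `select_release` and `fake_validator` are hypothetical, and the `validate` callable stands in for whichever real validator a product wraps. The labeled candidate is the one that gets validated, since the identification metadata is itself part of what these standards require.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class ValidationReport:
    """Hypothetical result type; adapt it to the validator actually in use."""
    conforms: bool
    failures: Tuple[str, ...] = ()


# Any callable that takes the PDF bytes and returns a report (an assumption).
Validator = Callable[[bytes], ValidationReport]


def select_release(
    labeled: bytes, unlabeled: bytes, validate: Validator
) -> Tuple[bytes, ValidationReport]:
    """Release the labeled candidate only if it validates.

    `labeled` is the candidate file carrying the conformance metadata;
    `unlabeled` is the same content with no conformance claim.
    """
    report = validate(labeled)
    if report.conforms:
        return labeled, report
    # If in doubt, leave it out: ship the file without the claim.
    return unlabeled, report


def fake_validator(pdf: bytes) -> ValidationReport:
    """Stand-in for a real validator; rejects anything containing b"BROKEN"."""
    if b"BROKEN" in pdf:
        return ValidationReport(False, ("example failure",))
    return ValidationReport(True)
```

Returning the report alongside the chosen file lets a build or test pipeline log the failures during development, rather than discovering them after mislabeled files have reached customers.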

OPF patches JHOVE 1.30.1

JHOVE is a Java-based open-source file format identification, validation and characterisation tool for digital preservation and part of the Open Preservation Foundation reference toolset. The JHOVE 1.30.1 patch release fixes a PDF validation problem introduced in JHOVE 1.30 and is therefore an important update for preservationists and archivists.

ISO TC 130 WG 2 developing PDF/X standards

The broadly-titled ISO TC 130 WG 2 “Prepress data exchange” is the main ISO working group developing PDF-based standards for the graphic arts industry. A recent interview with the Convenor resulted in the publication of some confusing information. To clarify:

  1. ISO 15930 (PDF/X) is developed by ISO TC 130 WG 2 while ISO 19005 (PDF/A) is developed by ISO TC 171 SC 2 WG 5. While many PDF subject matter experts actively work across both committees, each ISO committee also includes related industry stakeholders to ensure that international standards for PDF both meet unique industry needs and are aligned (as much as technically possible) across these independent TCs. This can be a challenging task - especially with committees working to different deadlines - so the PDF Association hosts agile forums that allow both sets of experts to collaborate.
  2. ISO operates on a consensus basis, with member countries balloting to approve each step in the process. In 2007 Adobe turned over the core PDF specification to ISO TC 171 SC 2. Today, Adobe experts - working as equal participants alongside experts from many other vendors in vendor-neutral processes in both ISO committees - provide input to the US national position for all PDF standards.

TC 130 WG 2 is currently “defining ‘Best Practices’ for the processing steps for PDF workflows. This work is done in cooperation with the Ghent PDF Workgroup, and the PDF Association using PDF/X-6 and PDF 2.0.”

The PDF Association maintains a detailed list of current ISO work items across all ISO committees involved in PDF.

GWG offers seminars on PDF

The Ghent Workgroup (GWG) has announced the next webinar in its free webinar series, titled “Unlocking the benefits of PDF 2.0”, for August 22, 2024 at 4:00 pm CET:

“PDF 2.0 was first released in 2017. Seven years on it is still not widely used, though. Is that because it has nothing to offer to the graphics arts industry or is it because vendors are not taking advantage yet of the interesting new features of PDF2.0? Find out more about this during the webinar in which we explain what the new things are and you can judge for yourself.”

In collaboration with the Canadian Institute of Graphic Communications and Printability in Quebec, Canada, GWG has announced a half-day seminar on “Global Standards and Practices” for PDF in the graphic arts, to be held on Tuesday, September 3, 2024. The seminar features four sessions:

  • PDF workflow tools
  • Embellishment
  • PDF preflighting
  • PDF for packaging, covering ISO 19593-1.

PDFacademicBot for August, 2024

Alenka, K.-Č. and Andreja, H. (August 2024) ‘Enhancing E-book Accessibility: Insights from User Practices and Needs Across Different Communities’, Central European Library and Information Science Review Közép-európai Könyvtár- és Információtudományi Szemle [Preprint]. https://doi.org/10.3311/celisr.37389.

Carrara, A., Nousias, S. and Borrmann, A. (January 2024) ‘Employing graph neural networks for construction drawing content recognition’, p. 10. https://mediatum.ub.tum.de/doc/1748705/vjffmc66e8g1p1nfeet4k1i3f.2024_Carrara_I3CE.pdf.

Ding, Y., Lee, J. and Han, S.C. (August 2024) ‘Deep Learning based Visually Rich Document Content Understanding: A Survey’. arXiv. https://doi.org/10.48550/arXiv.2408.01287.

Tufail, H. et al. (August 2024) ‘Advancements in Query-Based Tabular Data Retrieval: Detecting Image Data Tables and Extracting Text using Convolutional Neural Networks’, p. 11. [Preprint] https://doi.org/10.20944/preprints202408.0108.v1.

Kumar, A. and Wang, L.L. (2024) ‘Uncovering the New Accessibility Crisis in Scholarly PDFs’, in ASSETS ’24. 26th International ACM SIGACCESS Conference on Computers and Accessibility, St. John’s, NL, Canada: ACM. https://doi.org/10.1145/3663548.3675634.

Liu, F., Kang, Z. and Han, X. (2024) ‘Optimizing RAG Techniques for Automotive Industry PDF Chatbots: A Case Study with Locally Deployed Ollama Models’, arXiv.org [Preprint]. https://arxiv.org/abs/2408.05933v1

Macko, B.D. (May 2024) Enhancing Common Criteria Certificate Analysis with Semantic Segment Search. Master’s Thesis. Masaryk University. https://is.muni.cz/th/k87j7/MasterThesis.pdf.

Mathur, P. et al. (August 2024) ‘DocPilot: Copilot for Automating PDF Edit Workflows in Documents’, in Y. Cao, Y. Feng, and D. Xiong (eds) Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations). Bangkok, Thailand: Association for Computational Linguistics, pp. 232–246. https://aclanthology.org/2024.acl-demos.22

de los Pinos, A.J.C. et al. (June 2024) ‘Sustainable Development Applied to Presentation Environments for Final Degree Projects’, in D. Bienvenido-Huertas, M.L. de la Hoz-Torres, and A.J. Aguilar Aguilera (eds) Teaching Innovation in Architecture and Building Engineering: Challenges of the 21st century. Cham: Springer Nature Switzerland, pp. 553–567. https://doi.org/10.1007/978-3-031-59644-5_31.

Qasim, L. and Alisaraie, L. (August 2024) ‘ProS2Vi: a Python Tool for Visualizing Proteins Secondary Structure’. Canada: arXiv preprint. https://doi.org/10.48550/arXiv.2408.03436.

Stivala, G. et al. (2024) ‘Uncovering the Role of Support Infrastructure in Clickbait PDF Campaigns’. Euro S&P 2024: arXiv. https://doi.org/10.48550/arXiv.2408.06133.

Šola, H.M., Qureshi, F.H. and Khawaja, S. (August 2024) ‘Predicting Behaviour Patterns in Online and PDF Magazines with AI Eye-Tracking’, Behavioral Sciences, 14(8), p. 677. https://doi.org/10.3390/bs14080677.

Tan, J. and Rigger, M. (July 2024) ‘Inconsistencies in TeX-Produced Documents’, in Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis. ISSTA’24, Vienna, Austria: ACM, p. 13. [Preprint] https://doi.org/10.1145/3650212.3680370.

Varalakshmi, M.I. et al. (July 2024) ‘Detection of Malware in Pdfs Using Hybrid Algorithm’, African Journal of Biological Sciences, 6(6), pp. 7511–7521. https://doi.org/10.33472/AFJBS.6.6.2024.7510-7521.

Zeeshan Javed, S.M. and Amjad, F. (June 2024) ‘Unveiling Hidden Threats in PDFs with Hybrid Malware Classification’, in 2024 IEEE 30th International Conference on Telecommunications (ICT). 2024 IEEE 30th International Conference on Telecommunications (ICT), Amman, Jordan: IEEE, pp. 01–06. https://doi.org/10.1109/ICT62760.2024.10606032.
