veraPDF’s Arlington PDF Model Checker released
File type for downloadable forms, in 3 letters | Mobile devices are pain-points for forms | But, Android support for PDF is improving | veraPDF Arlington Checker released | PDF Week Prague 2024 | iPRES 2024 Great Preservation Bake-off | New PDF Association liaison on HDR with ISO TC 42 Photography | GWG to publish more educational materials | PDFacademicBot for September 2024.
File type for downloadable forms, in 3 letters
The clue for 1A in the September 5 New York Times crossword puzzle was “File type for downloadable forms” in 3 letters. They refresh the site daily, so we had to take a screenshot.
Maybe it’s just us, but we think the answer is “PDF”! 😀
And yet… mobile devices are pain-points for forms
Federal News Network: The US federal government is taking a hard look at forms workflows… and today’s mobile solutions aren’t cutting it.
"We started with fillable PDFs that can be challenging on some mobile devices if you don’t have the right software. So we’re pushing all of these toward just essentially web fillable forms."
Android support for PDF is improving
Android 15 sports a new PDFRenderer API with numerous enhancements for PDF viewing, including support for password-protected files, annotations, form editing, searching, and selection.
veraPDF Arlington Checker released
Moving on from its previous development preview announcement, the Open Preservation Foundation (OPF) has announced the release of its veraPDF Arlington Checker. This integration of the Arlington PDF Model is available as a veraPDF GUI and CLI download, a Docker image and REST API on DockerHub, and a public web demo.
iPRES 2024 Great Preservation Bake-off
As part of the iPRES 2024 Great Preservation Bake-off - “a friendly but competitive format for live demos of great tools for digital preservation” - the Open Preservation Foundation will demo its veraPDF validator software powered by the Arlington PDF Model, as mentioned above.
The free and open-source Arlington PDF Model was originally developed by the PDF Association for DARPA’s “SafeDocs” research program. It’s a comprehensive machine-readable, specification-derived data model - including data integrity relationships - for all ISO-standardized and commonly encountered PDF objects. The PDF Association maintains the resource, including resolved ISO 32000-2 PDF errata. Using software that leverages the Arlington PDF Model will help preservationists identify deviations from PDF’s specification, whether they be extensions, private data, malformations, or other issues that might otherwise remain hidden.
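To give a feel for what checking against a machine-readable model means, here is a minimal sketch in Python. It is not the Arlington PDF Model's actual format, nor how veraPDF works: the `KeyRule` class, the `TOY_CATALOG_MODEL` table, and the `find_deviations` function are all hypothetical names invented for illustration, and the toy model covers only three keys of one object. It shows a checker reporting missing required keys, wrongly typed values, and keys the model does not know about.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyRule:
    """Hypothetical rule for one dictionary key (not the real Arlington schema)."""
    required: bool
    allowed_types: tuple


# Toy stand-in for a model of one PDF object; the real model is far richer.
TOY_CATALOG_MODEL = {
    "Type": KeyRule(required=True, allowed_types=(str,)),
    "Pages": KeyRule(required=True, allowed_types=(dict,)),
    "Lang": KeyRule(required=False, allowed_types=(str,)),
}


def find_deviations(obj, model):
    """Return human-readable deviations of `obj` from `model`."""
    issues = []
    # First pass: every key the model describes.
    for key, rule in model.items():
        if key not in obj:
            if rule.required:
                issues.append(f"missing required key: {key}")
        elif not isinstance(obj[key], rule.allowed_types):
            issues.append(f"wrong type for key: {key}")
    # Second pass: keys present in the object but absent from the model.
    for key in obj:
        if key not in model:
            issues.append(f"unknown key (possible extension or private data): {key}")
    return issues


# A deliberately flawed object: no Pages, a numeric Lang, and a private key.
sample = {"Type": "Catalog", "Lang": 42, "ACME_Private": "x"}
report = find_deviations(sample, TOY_CATALOG_MODEL)
for line in report:
    print(line)
```

An unknown key is reported, not rejected, because it may be a legitimate extension or private data, which is exactly the kind of thing a preservationist wants surfaced.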
Liaison on HDR with ISO TC 42 Photography
The PDF Association has announced that its request for Category A liaison with the ISO committee responsible for high dynamic range (HDR) encodings (ISO TC 42 Photography) has been accepted. As a result, PDF Association members in our Imaging Model TWG, which is working on including HDR in a future version of PDF, can now access and comment on work-in-progress documents about HDR from ISO TC 42.
As the principal SDO for PDF, the PDF Association now has ISO Category A liaisons with:
- ISO TC 171 SC 2 (for core PDF, PDF/A, and PDF/UA)
- ISO TC 130 (for PDF/X, PDF/VT, PDF/VCR and related standards)
- ISO TC 42 (for HDR)
- ISO JTC 1 SC 34 (for related document formats)
The PDF Association also maintains liaisons with other organizations, including:
- C2PA (Coalition for Content Provenance and Authenticity)
- ETSI (European Telecommunications Standards Institute)
- ICC (International Color Consortium)
- IPTC (International Press Telecommunications Council)
- Khronos Group
Members of the PDF Association can thus gain access to the very latest technical information helping shape the future of PDF.
GWG to publish more educational materials
The Ghent Workgroup (GWG) has announced via LinkedIn that they will soon publish more educational videos and posters “created by dedicated students who are eager to push the boundaries in the world of graphics”. This builds on the successful efforts of past students who produced educational infographics and short videos about PDF in the graphic arts industry.
City information available for PDF Week Prague 2024
For those visiting Prague and the Czech Republic for PDF Week Prague 2024, we have prepared an informative document covering our venue, language, currency, public transport, and some sightseeing suggestions.
PDF in the Wild for September 2024
Abdul-Samad, A. et al. (2024) ‘Comprehensive Review on Data Preservation Models and Standards in Digital Forensic’, in 2024 International Conference on Data Science and Its Applications (ICoDSA), pp. 277–282. https://doi.org/10.1109/ICoDSA62899.2024.10651616.
Bank, H.S. and Herber, D.R. (August 2024) ‘CatalogBank: A Structured and Interoperable Catalog Dataset with a Semi-Automatic Annotation Tool (DocumentLabeler) for Engineering System Design’. arXiv. https://doi.org/10.48550/arXiv.2408.08238.
Bensahla, A. et al. (2024) ‘Unsupervised Extraction of Body-Text from Clinical PDF Documents’, Studies in Health Technology and Informatics, 316, pp. 214–215. https://doi.org/10.3233/SHTI240382.
Ding, Y. et al. (2024) ‘MMVQA: A Comprehensive Dataset for Investigating Multipage Multimodal Information Retrieval in PDF-based Visual Question Answering’, in Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI-24), Jeju, South Korea: International Joint Conferences on Artificial Intelligence Organization, pp. 6243–6251. https://doi.org/10.24963/ijcai.2024/690.
Lyu, Y. (2024) ‘Automated PDF Data Extraction Tool - An Interactive Platform with Python Dash’, PharmaSUG China 2024 [Preprint]. https://www.lexjansen.com/pharmasug-cn/2024/CC/Pharmasug-China-2024-CC10008_Final_Paper.pdf
Sheng, L. and Xu, S.-S. (Sept. 2024) ‘PdfTable: A Unified Toolkit for Deep Learning-Based Table Extraction’. arXiv. https://doi.org/10.48550/arXiv.2409.05125.
Spencer, R. (2024) ‘Design patterns in Digital Preservation: declarative software for digital preservationists’, in iPRES 2024: The 20th International Conference on Digital Preservation, Ghent, Belgium, p. 17. Preprint. http://exponentialdecay.co.uk/blog/wp-content/uploads/2024/05/rspencer_iPRES2024_paper-design-patterns-declarative_20240115.docx.pdf.