Email archiving with PDF/A
The main challenge when archiving emails is the sheer variety of file formats a business has to deal with. There is the email itself, of course, but the real problem is the attachments, which can be images, scanned documents or PDF, Word or Excel files. These formats tend to be proprietary and non-standard, but converting emails and their attachments to PDF/A ensures they remain readable for long periods of time. The format is self-contained: all fonts are embedded within the file, for example, as is the relevant metadata. PDF/A also ensures that a file will be displayed exactly the same on any system and forbids any dynamic content. Finally, and this is the most important argument of all, the ISO standard is designed for long-term archiving, so that files saved in the PDF/A format remain reproducible and readable for decades to come. We can consider it guaranteed that a PDF/A viewer will always be available over the span of time required for archiving.
During the conversion process, the header information within an email is extracted and added to the PDF/A file's XMP metadata, allowing users to search for specific emails. The body of the email is converted to PDF/A and becomes the page content of the PDF/A file itself. The three different versions of the PDF/A standard open up a range of options, all of which come into play when archiving email attachments. With PDF/A-1, which does not permit embedded files, attachments are converted and appended to the email as additional pages. PDF/A-2 allows the user to embed additional PDF/A files, meaning that attachments can be integrated as PDF/A files in their own right. The third and latest version of the standard, PDF/A-3, goes a step further: because it allows users to embed any file type, attachments can be integrated into the archive-ready file in both source and PDF/A formats. The email can also be embedded in its original format, if desired, meaning that it can then be opened in an email client, for example to reply to the sender.
Under this system, it is important to preserve the relationship between archive-ready documents, emails and attachments within the PDF/A file itself. In other words, this relationship should not depend on the archive system. This guarantees complete system neutrality, a key advantage which can avoid issues such as time-intensive system migration projects down the line.
When choosing and integrating PDF/A tools, archivists should seek expert advice. One good source of information on the subject is the PDF Association. The PDF/A Competence Center is a subdivision of the Association and an excellent point of contact with a wide range of PDF/A experts.
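The header extraction and the per-version handling of attachments described above can be illustrated with a short Python sketch. It uses only the standard library `email` package and does not produce a PDF/A file. It only gathers the metadata and decides what a converter would do with each attachment. The mapping `HEADER_TO_XMP` (including the `mail:` prefix), the function names and the sample message are assumptions made for this illustration, not part of the PDF/A standard or of any particular product.

```python
from email.message import EmailMessage
from email.utils import parsedate_to_datetime

# Illustrative mapping from email headers to XMP-style property names.
# "mail:to" and "mail:messageId" are hypothetical keys chosen for this sketch.
HEADER_TO_XMP = {
    "Subject": "dc:title",
    "From": "dc:creator",
    "Date": "xmp:CreateDate",
    "To": "mail:to",
    "Message-ID": "mail:messageId",
}


def extract_header_metadata(msg: EmailMessage) -> dict[str, str]:
    """Collect searchable header values, keyed by XMP-style property name."""
    metadata = {}
    for header, xmp_key in HEADER_TO_XMP.items():
        value = msg[header]
        if value is None:
            continue  # header not present in this email
        text = str(value)
        if header == "Date":
            # Normalise the RFC 5322 date to ISO 8601 for the metadata field.
            text = parsedate_to_datetime(text).isoformat()
        metadata[xmp_key] = text
    return metadata


def plan_attachments(msg: EmailMessage, pdfa_part: int) -> list[str]:
    """Describe how each attachment would be handled for PDF/A-1, -2 or -3."""
    actions = {
        1: "append {name} as converted pages",
        2: "embed {name} converted to PDF/A",
        3: "embed {name} in source format and as PDF/A",
    }
    if pdfa_part not in actions:
        raise ValueError(f"unsupported PDF/A part: {pdfa_part}")
    return [
        actions[pdfa_part].format(name=part.get_filename())
        for part in msg.iter_attachments()
    ]


def build_sample_message() -> EmailMessage:
    """Build a small made-up email with one attachment for demonstration."""
    msg = EmailMessage()
    msg["Subject"] = "Invoice 4711"
    msg["From"] = "alice@example.com"
    msg["To"] = "archive@example.com"
    msg["Date"] = "Mon, 03 Mar 2014 10:15:00 +0100"
    msg.set_content("Please find the invoice attached.")
    msg.add_attachment(
        b"dummy spreadsheet bytes",
        maintype="application",
        subtype="octet-stream",
        filename="report.xlsx",
    )
    return msg


if __name__ == "__main__":
    sample = build_sample_message()
    print(extract_header_metadata(sample))
    for part_number in (1, 2, 3):
        print(part_number, plan_attachments(sample, part_number))
```

In practice the message would be parsed from the mail store rather than built by hand, for instance with `email.message_from_bytes(raw, policy=email.policy.default)`, and the resulting dictionary and action list would be passed to whichever PDF/A conversion tool is in use.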
Dietrich von Seggern, Business Development Manager, callas software GmbH, summarized the situation: "More and more businesses are archiving emails in PDF/A format. Handling attachments is a critical part of this strategy, and the third version of the PDF/A standard allows them to be archived as both original and archive-ready files."