Packaging email archives using PDF

EA-PDF establishes high-level requirements for using PDF technology to package email for long-term preservation.
About the author: The PDF Association staff delivers a vendor-neutral platform for PDF’s stakeholders, facilitating the development of open specifications and ISO standards for PDF technology. Staff members include: Alexandra Oettler (Editor), Betsy Fanning …
PDF Association staff

February 11, 2021




Archiving email is neither easy nor obvious. Solutions are commonly vendor-specific and require an email client to read the archive, which is not ideal for static records.

In 2019, the University of Illinois was awarded a grant by the Andrew W. Mellon Foundation to develop conversion criteria and requirements for archiving email into PDF containers. The final report, “A Specification for Using PDF to Package and Represent Email”, is now available from the University of Illinois IDEALS Repository.

The report detailing the EA-PDF concept establishes high-level functional requirements for using ISO 32000 (PDF) technology as a model for packaging email for long-term preservation purposes. These requirements detail desirable functionality reflecting considerable input from stakeholders in digital preservation, government, education and industry communities.

PDF is ubiquitous and widely accepted, offers rich capabilities, and has an open, well-documented specification that is already supported by a global ecosystem of developers. PDF facilitates redaction and includes advanced digital signature technology, XMP metadata, semantic tagging, associated files, rich media, 3D and many other technologies that make it highly effective for many digital content archiving applications. PDF's magic lies in its reliability and interoperability, so using (and truly leveraging) PDF to make email archives interoperable turns out to be a reasonable application of the technology.

Conceptually, EA-PDF is no more complex than the underlying source email, but it represents that complexity in a formally defined manner within the structures of the PDF container. MBOX, EML, and similar formats are not so much well-defined formats as families of formats, defined more by client implementations than by authoritative specifications. PDF provides a means to represent these implementations in a normalized packaging model, regardless of the underlying source.
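
To make the idea of normalization concrete, here is a minimal sketch in Python that reads one raw email message (the content of an EML file, for example) into a small, source-independent record using only the standard library. The function name `normalize_email` and the record layout are assumptions made for illustration. They are not taken from the EA-PDF report, and the sketch stops short of writing any PDF.

```python
from email import message_from_bytes, policy


def _header(msg, name):
    """Return a header as a plain string, or None if it is absent."""
    value = msg[name]
    return str(value) if value is not None else None


def normalize_email(raw: bytes) -> dict:
    """Parse raw message bytes into a hypothetical normalized record."""
    msg = message_from_bytes(raw, policy=policy.default)

    # Prefer the plain-text body; fall back to HTML if that is all there is.
    body_part = msg.get_body(preferencelist=("plain", "html"))
    body = body_part.get_content() if body_part is not None else ""

    # Keep attachments as (filename, content type, bytes) so that a later
    # packaging step could embed them unchanged.
    attachments = [
        (part.get_filename(), part.get_content_type(),
         part.get_payload(decode=True))
        for part in msg.iter_attachments()
    ]

    return {
        "message_id": _header(msg, "Message-ID"),
        "from": _header(msg, "From"),
        "to": _header(msg, "To"),
        "subject": _header(msg, "Subject"),
        "date": _header(msg, "Date"),
        "body": body,
        "attachments": attachments,
        "source": raw,  # original bytes, retained alongside the parsed view
    }
```

Keeping the original bytes next to the parsed fields reflects the packaging idea described above: the rendered view and the source message can travel together. How those pieces map onto PDF structures is defined by the report, not by this sketch.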

The EA-PDF concept integrates the capture of EML or MBOX content with PDF as a packaging, representation and distribution model for anything from individual emails to complete mailboxes. By recasting the email archiving problem within the EA-PDF framework, “A Specification for Using PDF to Package and Represent Email” offers a thought-provoking take on leveraging the unique power of PDF to cut this Gordian knot.
