This presentation is part of PDF Days Europe 2025.
PDF Forensics and the Metadata conundrum
Metadata? What metadata?
About the presenter(s)
Cherie Ekholm is a Product Strategy Lead at Verisk. She began working on PDF standards in 2007 while working at Microsoft in the Office business unit. Cherie represented Microsoft in …
Description
Metadata is not the only consideration when doing a forensic examination of a document. It is, however, one of the quickest and easiest sets of items to check when determining whether a document warrants a further look. In some cases it can, at the very least, be enough to show us that a document has been changed since its original save.
However, depending on your definition of metadata, the PDF standard does not require its use. Some implementations output no metadata at all, not even creation and modification dates or producer and creator information. Many implementations output only the PDF version information, and even that not in a reliable manner.
Using tools such as Apache Tika, exiftool, and various others, a forensic examiner may be able to extract metadata and metadata-adjacent information hiding in the nooks and crannies of PDF files. Yet a significant number of files in the wild still manage to be, for all practical purposes, completely lacking in actionable metadata.
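As a rough illustration of this kind of first-pass check, the sketch below scans the raw bytes of a file for the header version, the common document information entries (producer, creator, creation and modification dates), and the number of `%%EOF` markers, which can hint at saves appended after the original one. It is not taken from the presentation and does not use Tika or exiftool. The function name `quick_metadata_scan` is hypothetical, and the sketch assumes an unencrypted file whose information dictionary is stored uncompressed with simple literal strings. Files that fail those assumptions will simply report `None`, which is itself the "no actionable metadata" situation described above.

```python
import re

# Information dictionary entries named in the description above.
INFO_KEYS = (b"Producer", b"Creator", b"CreationDate", b"ModDate")


def quick_metadata_scan(data: bytes) -> dict:
    """Naive first-pass scan of raw PDF bytes (hypothetical helper).

    Assumption: values are plain literal strings such as (Some Writer 1.0)
    without nested parentheses. Entries held in compressed object streams,
    hex strings, or encrypted files will not be found.
    """
    result = {}

    # The header comment %PDF-M.N normally sits at the very start of the file.
    # Note: a /Version entry in the document catalog may override it, so the
    # header alone is not a reliable version indicator.
    header = re.search(rb"%PDF-(\d\.\d)", data[:1024])
    result["header_version"] = header.group(1).decode("ascii") if header else None

    for key in INFO_KEYS:
        # Match "/Key (value)", allowing backslash escapes inside the value.
        match = re.search(rb"/" + key + rb"\s*\(((?:\\.|[^\\()])*)\)", data)
        result[key.decode("ascii")] = (
            match.group(1).decode("latin-1") if match else None
        )

    # Each incremental save appends a section that ends with its own %%EOF,
    # so more than one marker is a hint (not proof) of later changes.
    result["eof_markers"] = data.count(b"%%EOF")
    return result
```

To use it, read a file in binary mode and pass the bytes in, for example `quick_metadata_scan(open(path, "rb").read())`, where `path` is whatever file is under examination. A result with differing creation and modification dates, or more than one `%%EOF` marker, is only a prompt for a closer look with a dedicated tool.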
This presentation will look at what metadata is and why it’s important for forensic work, and touch on some of the best tools for finding any metadata in a document. Finally, we’ll talk about how the PDF standards, and implementers of those standards, can better support its use.