A case study in PDF forensics: The Epstein PDFs

This article details a PDF forensics case study on a small, random selection of the Epstein PDF files released by the US Department of Justice (DoJ). The tranche contains 4,085 PDF files, with an estimated 5,879 remaining unreleased. Key findings include:

A difference in PDF version reporting between forensic tools.
The presence of two incremental updates.
The discovery of a hidden (orphaned) document information dictionary revealing the software used in processing.
The DoJ avoided JPEG images to prevent metadata leakage.
Overall, the DoJ’s sanitization workflow could be improved to reduce file size and information leakage.
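
The first two findings can be roughly checked at the byte level. The Python sketch below is an illustration only and is not the tooling used in the case study. It assumes that one plausible reason for tools disagreeing on the PDF version is that some read only the `%PDF-x.y` header while others honor a `/Version` entry in the document catalog. It also assumes that each `%%EOF` marker after the first corresponds to one incremental update. Both checks are heuristics: the function name `summarize_pdf_bytes` is hypothetical, a `/Version` entry stored inside a compressed object stream will not be found, and linearized files or stream data containing `%%EOF` will skew the count.

```python
import re


def summarize_pdf_bytes(data: bytes) -> dict:
    """Heuristic byte-level scan of a PDF; does not parse the object structure."""
    # Version declared in the file header, e.g. "%PDF-1.3".
    header = re.search(rb"%PDF-(\d\.\d)", data[:1024])

    # Any /Version name entries visible in uncompressed objects
    # (assumption: these belong to the document catalog).
    catalog_versions = re.findall(rb"/Version\s*/(\d\.\d)", data)

    # Each saved revision normally ends with %%EOF, so every marker
    # after the first suggests one incremental update.
    eof_count = len(re.findall(rb"%%EOF", data))

    return {
        "header_version": header.group(1).decode() if header else None,
        "catalog_versions": [v.decode() for v in catalog_versions],
        "eof_markers": eof_count,
        "incremental_updates": max(eof_count - 1, 0),
    }


# Synthetic example bytes, not taken from any released file.
sample = (
    b"%PDF-1.3\n"
    b"1 0 obj\n<< /Type /Catalog /Version /1.5 >>\nendobj\n"
    b"%%EOF\n"
    b"2 0 obj\n<< >>\nendobj\n%%EOF\n"
    b"3 0 obj\n<< >>\nendobj\n%%EOF\n"
)
print(summarize_pdf_bytes(sample))
```

A result in which `header_version` and `catalog_versions` disagree is a hint about why two tools might report different versions for the same file. A proper analysis would walk the cross-reference chain with a full PDF parser instead of scanning bytes.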

By Peter Wyatt
December 2025

Member News

Cloud in Omnichannel Communications

February 2023, by Carsten Luedtge, Compart GmbH

A large European service company developed an infrastructure for its customer communications that can be accessed by both the self-hosted … Read more

Datalogics Adds PDF Checker to Docker Hub

January 2023, by Lindsey Schroeder, Datalogics

PDF Checker – now available on Docker Hub – is a free tool from Datalogics that helps businesses and developers … Read more

Announcing the release of iText Suite 7.2.5

January 2023, iText

In this latest release, Core will open documents faster and be more resilient to malformed PDFs. We’ve also widened its … Read more

pdfRest Launches New API Tool: Query PDF

January 2023, by Eric Shore, Datalogics

Query PDF is a REST API tool that quickly returns PDF document details, metadata, and other conditional information about the … Read more

New Version of PDF Automation Server with Authenticator Module, Workflow Conversion Nodes, Improved PDF Web Viewer

January 2023, Qoppa Software, LLC

The latest PDF Automation Server comes with a new Pluggable Authenticator Module that allows organizations to integrate its REST API … Read more

callas software releases pdfToolbox 14.1

January 2023, by Dietrich von Seggern, callas software GmbH

callas software today releases a minor update to its flagship pdfToolbox product line, which corrects identified problems with the pdfToolbox … Read more
