A case study in PDF forensics: The Epstein PDFs

This article details a PDF forensics case study on a small, random selection of the Epstein PDF files released by the US Department of Justice (DoJ). The tranche contains 4,085 PDF files, with an estimated 5,879 remaining unreleased. Key findings include:

A difference in PDF version reporting between forensic tools.
The presence of two incremental updates.
The discovery of a hidden (orphaned) document information dictionary revealing the software used in processing.
The DoJ avoided JPEG images to prevent metadata leakage.
Overall, the DoJ’s sanitization workflow could be improved to reduce file size and information leakage.
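
The first two findings can be illustrated with a rough sketch. It rests on two general facts about the PDF format: the version declared in the file header can be overridden by a /Version entry in the document catalog, which is one plausible reason tools disagree, and each incremental update normally appends a new trailer ending in its own %%EOF marker. The function name summarize_pdf and the byte-scanning approach are assumptions made for illustration; this is not the tooling used in the case study, and a real forensic tool would parse the file structure rather than pattern-match on raw bytes.

```python
import re


def summarize_pdf(data: bytes) -> dict:
    """Heuristic summary of a PDF's declared versions and %%EOF markers.

    Hypothetical helper for illustration only: it scans raw bytes and
    does not parse the cross-reference structure.
    """
    # The header version appears near the start of the file, e.g. %PDF-1.3
    header = re.search(rb"%PDF-(\d\.\d)", data[:1024])

    # A catalog /Version entry may override the header version. This scan
    # misses entries stored inside compressed object streams.
    catalog_versions = re.findall(rb"/Version\s*/(\d\.\d)", data)

    # Each saved revision normally ends with %%EOF, so extra markers hint
    # at incremental updates. Linearized files and stream data that
    # happens to contain the marker can inflate this count.
    eof_count = len(re.findall(rb"%%EOF", data))

    return {
        "header_version": header.group(1).decode() if header else None,
        "catalog_versions": [v.decode() for v in catalog_versions],
        "eof_markers": eof_count,
        "incremental_updates_estimate": max(eof_count - 1, 0),
    }
```

Calling summarize_pdf on the bytes of a file, for example the result of open(path, "rb").read(), returns a dictionary in which a header_version that differs from an entry in catalog_versions points to the kind of version discrepancy described above, and an incremental_updates_estimate of 2 would be consistent with two incremental updates.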

By Peter Wyatt
December 2025

Member News

UPDF Achieves G2 Leader Status, Joins Top 4 Global PDF Editors in Winter 2026

January 2026, Superace Software Technologies Co., Ltd.

UPDF, Superace’s all-in-one AI-powered PDF solution, has been named a G2 Leader and ranked among the Top 4 PDF Editors …

Finetuned HTML derivation in the latest ngPDF release by Dual Lab

January 2026, Dual Lab sprl

Dual Lab is proud to announce the new release of the web application ngPDF. The application demonstrates the implementation of the derivation …