A case study in PDF forensics: The Epstein PDFs
This article details a PDF forensics case study on a small, random selection of the Epstein PDF files released by the US Department of Justice (DoJ). The tranche contains 4,085 PDF files, with an estimated 5,879 remaining unreleased. Key findings include:
- A difference in PDF version reporting between forensic tools.
- The presence of two incremental updates.
- The discovery of a hidden (orphaned) document information dictionary revealing the software used in processing.
- The DoJ avoided JPEG images to prevent metadata leakage.
- Overall, the DoJ’s sanitization workflow could be improved to reduce file size and information leakage.
A case study in PDF forensics: The Epstein PDFs
This article details a PDF forensics case study on a small, random selection of the Epstein PDF files released by the US Department of Justice (DoJ). The tranche contains 4,085 PDF files, with an estimated 5,879 remaining unreleased. Key findings include:
- A difference in PDF version reporting between forensic tools.
- The presence of two incremental updates.
- The discovery of a hidden (orphaned) document information dictionary revealing the software used in processing.
- The DoJ avoided JPEG images to prevent metadata leakage.
- Overall, the DoJ’s sanitization workflow could be improved to reduce file size and information leakage.
Liaison with TC 42 brings HDR closer
September 2024 by PDF Association staff
Article, For members only
PDF Association members – especially those working on PDF’s imaging model – can now access TC 42 WG 23 resources directly from pdfa.org.

Visit PDF Association staff's profile.

























