A case study in PDF forensics: The Epstein PDFs

This article details a PDF forensics case study on a small, random selection of the Epstein PDF files released by the US Department of Justice (DoJ). The tranche contains 4,085 PDF files, with an estimated 5,879 remaining unreleased. Key findings include:

  • A difference in PDF version reporting between forensic tools.
  • The presence of two incremental updates.
  • The discovery of a hidden (orphaned) document information dictionary revealing the software used in processing.
  • The DoJ avoided JPEG images to prevent metadata leakage.
  • Overall, the DoJ’s sanitization workflow could be improved to reduce file size and information leakage.

By Peter Wyatt
December 2025

Tagging Images in PDF/UA-1 and PDF/UA-2

July 2025 by Paul Rayius (Allyant)
Article


Experience since PDF/UA-1 was published has revealed that some rules were too prescriptive. In some situations the semantic significance of an image is due to its representation of text, and … Read more

June 2025 by Patrick Gallot (Datalogics)

PDF has long relied on Flate for general-purpose data compression. Today, Brotli has emerged as a promising new approach that … Read more

Article

June 2025 by Patrick Gallot (Datalogics)

This article explains the options in allowing, limiting, and/or preventing data scraping from occurring when AI and text data mining … Read more

Article

June 2025 by PDF Association staff

In support of the International Archives Week theme and its four subthemes, the PDF Association is announcing its forthcoming Best … Read more

Article

May 2025 by PDF Association staff

Announcing the Monday keynote speaker for PDF Days Europe 2025, the new poster sessions and Peter Wyatt’s PDF Workshop!

Article

April 2025 by Duff Johnson

What does it take to build the organizational skills and resources necessary to succeed in ensuring PDF documents are accessible?

Article

March 2025 by Robin Watts (Artifex Software Inc)

Brotli compression is coming soon to PDF 2.0, enabling smaller file sizes. Sample PDF files are available to kickstart development!

Article

March 2025 by Erin Dempsey (TWAIN Working Group)

This article explores how ISVs can use PDF/R to improve imaging application software’s efficiency, reduce development complexity, and enhance customer … Read more

Article

February 2025 by Peter Wyatt

Google’s Jpegli promises smaller JPEG images (and so, smaller PDF files) with no change to reader software! Check it out … Read more

Article

January 2025 by Peter Wyatt

What’s the smallest possible fully compliant and validatable PDF file? We set out to answer the question.

Article

January 2025 by Peter Wyatt

Triggered by our own experience in preparing the cheat sheet collection, this article explores the capabilities of PDF collections and … Read more

Article
