SafeDocs: DARPA Does PDF

Duff Johnson, Peter Wyatt // June 13, 2019

PDF Association awarded DoD contract to advise on research into enhancing trust in digital content and streams.

Last year the PDF Association reported that the Defense Advanced Research Projects Agency (DARPA), a unit of the US Department of Defense (DoD), was launching the DARPA Safe Documents (SafeDocs) program, a fundamental research project to develop novel parser methodologies for ensuring safety in digital content.

Although funded by the DoD, DARPA's SafeDocs program is about cyber security in general, and isn’t centered on specifically military needs. The program is intended to generate technology that is attractive to industry developers. In the words of DARPA Program Manager, Dr. Sergey Bratus:

“The parser construction kits developed by SafeDocs will be usable by industry programmers who understand the syntax of electronic data formats but lack the theoretical background in verified programming.” (source)

SafeDocs is not PDF-specific. PDF was selected because it is significant in its own right, and because PDF files commonly embed other widely used formats and languages such as JPEG, TrueType and JavaScript.

The PDF Association in SafeDocs

Recognizing the potential significance of this program to PDF users, the PDF Association’s Board of Directors decided to actively participate in the project.

Led by Duff Johnson and Peter Wyatt, the ISO Project Leaders for PDF (ISO 32000), the PDF Association will help set appropriate challenges, ensure meaningful objectives and provide vendor-neutral technical support and format guidance to the cyber security researchers.

“This is an exciting project with many potential outcomes for industry,” said Peter Wyatt, the PDF Association’s Principal Scientist and PDF Principal Investigator on SafeDocs. “We believe that for those with significant PDF technologies, seeding DARPA’s program with insights from major players will help move content security and trust well beyond the current state of the art,” Wyatt said.

DARPA's SafeDocs program begins with the collection and analysis of very large (10^10 files), representative corpora, and with building an understanding of diverse industry implementations. An early step will be outreach to industry to help inform ground truth, environmental considerations and other technical and marketplace dynamics relevant to questions of security and trust in handling content.

Benefits to industry

For its members, the PDF Association will establish a SafeDocs Technical Working Group (TWG) as a platform for discussion about DARPA's SafeDocs program within the PDF industry.

The program's outcomes will be open source software (OSS) and, potentially, a defined “safe” subset of PDF offering the following advantages:

  • Innovations in verified programming for more robust handling of input data
  • Reduced vulnerabilities in PDF and nested formats
  • Improved end user experience
  • Increased trust in rich features
  • Potentially, intermediate products such as research, corpora and domain-specific languages (DSLs)
  • Clarifications and enhancements to core specifications

The award, Contract No. HR001119C0079, is posted on FBO.gov.

More about SafeDocs

More information about DARPA's SafeDocs program.

This research was developed with funding from the Defense Advanced Research Projects Agency (DARPA).

The views, opinions and/or findings expressed are those of the authors and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.


ABOUT THE AUTHORS

Duff Johnson

As CEO of the PDF Association and as an ISO Project Leader, Duff coordinates industry activities, represents industry stakeholders in a variety of settings and promotes the advancement and adoption of PDF technology worldwide.


Peter Wyatt

Peter Wyatt is the PDF Association’s CTO and an independent technology consultant with deep file format and parsing expertise. A developer and researcher, he has been actively working on PDF technologies for more than 20 years. He is Project co-Leader of ISO 32000 (the core PDF standard), co-chairs the PDF Association PDF TWG and is the PDF Association’s Principal Scientist leading …
