This article traces the business adoption of PDF, today’s dominant digital document format, from the mid-1990s, as detailed in my first book “From Paper to Web” (Adobe Press, 1997), to the present. My latest book “PDF Expert” shares my experience working directly with the world’s most demanding companies and users over the last 25 years.
My attraction to PDF began when I saw Acrobat Capture on a big screen at the AIIM Show in 1995. Having been dedicated to OCR since Kurzweil/Xerox in 1983, I was stunned at this brilliant breakthrough – instead of tildes for OCR errors, human-readable image snips of the suspect words appeared in the text! And the output pages looked like the original! I tested Acrobat Capture 1.0 and published an article in Imaging World called “Acrobat Capture vs. OCR: Apples & Oranges.”
Chris Hunt, Project Lead on the Capture team, reached out to me and asked me to write a book expanding on my insights. When he sent Adobe Acrobat 2.0, my amazement at the new format grew enormously. In addition to paper-equivalence for printing and portability, PDF had been superior in navigation since the release of PDF 1.0, which included text, images, pages, hypertext links, bookmarks and thumbnails. I was also surprised to discover that PDF incorporates a robust Document Management architecture in Doc Info fields and Acrobat Catalog.
What was supposed to be a book about capture and OCR rapidly evolved along with my fascinated exploration of this incredible new format and resulted in simultaneous paper and web publication of “From Paper to Web.” To demonstrate the universal format features of PDF, I put up a web server running Verity Search. My book may have been the first searchable PDF content on the Web when we launched on April 1, 1997. These days all browsers support PDF viewing, search and byte serving: when you click on a link in the ToC or Index, you immediately view the target page. That original PDF 1.2 document is still available.
In my experience working with thousands of companies and the most demanding PDF business users, the most commonly required PDF functionality includes:
- PDF Creation – Most users are satisfied with PDF 1.5, but support for PDF/A is often important for archival and accessibility needs. In law firms (the most demanding PDF users) with IP practices, the ability to create PDF acceptable for USPTO submission is crucial. PDF/X and PDF/E are important for their niche markets, which are served by focused competitors. Some web sites used to require Adobe Reader or even Acrobat, but that limitation is very rare now. The adoption of PDF 2.0 and PDF/A-4 has been slow, but I expect the recent release of the ISO standard 32000-2 by the PDF Association will accelerate market penetration.
- Document Editing – This is primarily required at the document or page level, since customers usually prefer to do their content editing in Word or Excel, especially on multipage documents where text and format elements flow over pages. Document or page editing enables the user to add, delete and reorder pages within a PDF document. Additional features called Document Assembly allow users to selectively choose pages from a source document, like Word, and place the pages into specific locations within the PDF. A related process involves combining multiple documents into a single PDF, with options to add the filenames as bookmarks. An enhancement feature renders these bookmarks as a Table of Contents in formal PDF documents.
- Redaction – Law firms were among the earliest adopters of PDF, a perfect format for an industry that lives on documents. The ability to redact sensitive information from documents is critical in law practice, as is metadata scrubbing, the other half of the process. Redaction removes what you see in the document; metadata scrubbing removes all other sensitive information.
- Security – This has become an increasingly vital feature in PDF, and superior encryption is one of the reasons industries may adopt PDF 2.0. The most security-sensitive customers demand documentation of a reliable Software Development Life Cycle.
- PDF Conversion – The ability to convert PDF, including PDF Image and PDF Normal, to Word and Excel documents became important as PDF entered the business market. Microsoft customers demanded a solution to work with content in PDF files. Microsoft approached ScanSoft, supplier of TextBridge OCR in Windows MODI (Microsoft Office Document Imaging), for a solution to bring PDF into Word. The result was PDF Converter for Word in 2003. ScanSoft soon added a PDF Printer and a basic PDF Editor to offer an early alternative to Acrobat. This product incorporated the OmniPage OCR engine, which offered both better text recognition of poor-quality images and much better format recognition. While the product roughly matched Acrobat in PDF creation and editing functions, its PDF conversion to MS Office offered better performance.
- Forms Conversion – Business users often need to fill in a form that they receive in static format, lacking fillable form fields. The original PDF Converter included FormTyper, which automatically recognizes data entry fields and checkboxes and creates fillable PDF forms from them, providing significant advantages in convenience and labor savings.
- Ease of Use – PDF users in business have been using Adobe Acrobat since the beginning. Early on it was not uncommon for a customer to ask: “Why is Adobe letting you do this?” A quick explanation of Adobe’s initial release of PDF as an open standard put them at ease, but everyone worries about the difficulty of moving to a new product. The adoption of the “MS Office-style Ribbon Interface” largely solves this problem, because all business users know their way around Word, at least, and they also know what they need to do with PDF. With all the PDF tools in a familiar UI, adoption of the new solution is fast and easy, usually requiring no training beyond the initial familiarization demo.
That initial sight of Acrobat Capture led to a career in PDF. It is rewarding that all the great promise I wrote about in “From Paper to Web” 25 years ago has come to fruition. In “PDF Expert,” I again focus on the rich functionality built into PDF architecture and demonstrate how to take full advantage of that potential, the way power users employ PDF to create, edit, distribute and protect valuable business content.
Tony McKinley was Lead SE for PDF for ScanSoft/Nuance/Kofax for over 17 years. He has recently published “PDF Expert – Master PDF and OCR” to share the lessons he has learned and to empower business users of PDF.