Today, more and more transactions are handled online across a broad range of categories. Transactions can be Business-to-Consumer (B2C) or Business-to-Business (B2B), local or international, for goods or services, and can be settled using credit cards, bank transfers or peer-to-peer payment networks.
While different approaches have been deployed to ensure the safety of data transmission and to verify the identity of the parties, very little has been done to ensure that the transactions are memorialized in a secure and reliable manner.
Memorializing the transaction in a reliable way becomes critical anytime that a transaction needs to be revisited, which can happen for many reasons, including audits, disputes and others. In these situations, it is imperative to have the ability to retrieve the transaction in a human readable form, for close inspection.
Most eCommerce systems in common use do not provide this guarantee, which can cause legal and financial problems when transactions have to be examined.
By far the most common method of memorializing transactions is to store the transaction data in a relational database (RDBMS). The transaction is broken up into its components (e.g. item numbers, quantities, cost per item, line items) and then stored as database records that can be spread across multiple tables.
The human readable / visual representation of the transaction is usually not stored at all.
If the transaction needs to be examined at any point in the future, the data is retrieved from the database and a human readable visual representation of the transaction is recreated either in printed form or for display in a browser.
Even when storing simple transactions, and more so for complex transactions, a single transaction will be stored as multiple records in multiple tables in a database. For instance, a simple invoice might be composed of a main record in an invoice table and multiple records in a line items table.
The transaction data then might have a complex structure and so recreating the human readable version of the transaction is not a trivial matter.
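To make this concrete, here is a minimal sketch of the invoice example, using Python's built-in sqlite3 module. The two-table schema, the column names and the `render_invoice` function are hypothetical and chosen only for illustration; real eCommerce schemas are usually far larger. The point is that the human readable view does not exist in storage and has to be rebuilt by code every time it is needed.

```python
import sqlite3

# Hypothetical two-table schema; real systems typically use many more tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer TEXT, issued TEXT);
    CREATE TABLE line_item (
        invoice_id INTEGER REFERENCES invoice(id),
        item_no TEXT, quantity INTEGER, unit_cost_cents INTEGER
    );
    INSERT INTO invoice VALUES (1001, 'Acme Corp', '2024-03-01');
    INSERT INTO line_item VALUES (1001, 'A-17', 2, 1250);
    INSERT INTO line_item VALUES (1001, 'B-04', 1, 4000);
""")


def render_invoice(conn, invoice_id):
    """Rebuild a human readable view from the scattered records."""
    customer, issued = conn.execute(
        "SELECT customer, issued FROM invoice WHERE id = ?", (invoice_id,)
    ).fetchone()
    rows = conn.execute(
        "SELECT item_no, quantity, unit_cost_cents FROM line_item "
        "WHERE invoice_id = ? ORDER BY item_no",
        (invoice_id,),
    ).fetchall()
    lines = [f"Invoice {invoice_id} for {customer}, issued {issued}"]
    total = 0
    for item_no, qty, cents in rows:
        subtotal = qty * cents
        total += subtotal
        lines.append(f"  {item_no} x{qty} @ {cents / 100:.2f} = {subtotal / 100:.2f}")
    lines.append(f"Total: {total / 100:.2f}")
    return "\n".join(lines)


invoice_text = render_invoice(conn, 1001)
print(invoice_text)
```

Even in this toy case, the printed invoice depends on both the stored rows and the rendering code; a change to either one changes what a reader sees.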
This approach generally works because there is a level of trust in the party that is storing the transaction, that the data in the database records will remain unaltered and that the method used to convert the data to a human readable form will remain the same as when the transaction was performed.
Either of these conditions may be false from the start, or may stop being true over time. When either condition fails, it will not be possible to reconstruct the human readable form of the transaction accurately. There are two main reasons why these assumptions might fail:
Once stored in a database, transaction data is supposed to remain unchanged. However, there are many reasons why this may not be the case, including data migration, defects in the eCommerce system and even manual intervention by IT staff or hackers.
Any time that transaction data is modified, it creates the potential that the transaction might be changed in meaningful ways.
To compound the problem, database systems generally do not keep a full audit record of all transaction data, so in most cases where the data is modified, there is no way to tell that the data has been modified at all, much less to verify that the data matches the original transaction data.
Some scenarios where transaction data can be modified:
Because transaction data is complex, the conversion to a human readable form is not a trivial process. The code that converts the data must gather all the different parts of the transaction, organize them and then create a visual representation that makes sense to a human.
Over time, as eCommerce systems are upgraded, the data that is stored with transactions will evolve, and this conversion process needs to evolve with it. As the data and the conversion process are modified, the converted results will also change, to a point where it might not be possible to reconstruct the same visual representation of the transaction at some point in the future.
Even when the visual representation might be equivalent to the original transaction, and look similar, accumulated changes over time might introduce subtle changes in the interpretation of the transaction until at one point they might make a material difference.
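The drift described above can be illustrated with a small sketch. The record layout and both renderer functions are hypothetical: `render_v1` stands for the conversion code in place when the transaction was made, and `render_v2` for the code after an assumed upgrade that reads the stored total as a pre-tax amount. The stored data is identical in both cases.

```python
# Hypothetical stored record; amounts are kept in cents to avoid float rounding.
record = {"id": "T-1001", "total_cents": 10000, "tax_rate_percent": 7}


def render_v1(rec):
    """Original renderer: the stored total already includes tax."""
    return f"Transaction {rec['id']}: total due {rec['total_cents'] / 100:.2f}"


def render_v2(rec):
    """Renderer after an assumed upgrade: the total is now read as pre-tax."""
    due = rec["total_cents"] * (100 + rec["tax_rate_percent"]) // 100
    return f"Transaction {rec['id']}: total due {due / 100:.2f}"


original_view = render_v1(record)
later_view = render_v2(record)
views_match = original_view == later_view

print(original_view)
print(later_view)
print("views match:", views_match)
```

Nothing in the database changed, yet the two views disagree on the amount due. A view captured at transaction time would not be affected by the later change in interpretation.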
This problem is compounded when a company replaces their eCommerce system with a different system. Not only does the data have to be transformed to fit the schema of the new system, but the conversion methods in the two systems will be very different, resulting in differences in the human readable forms as well.
We propose that by saving a visual representation of the transaction, created at the time that the transaction is made, these issues will be resolved. The natural format to store this representation would be the PDF format, for a number of reasons. PDF is the de facto universal electronic document format: it is used and accepted by anyone who uses electronic documents, and it provides features for long term archiving and document integrity.
By capturing the visual representation at the time that the transaction is processed, it is guaranteed that the data used in creating the document is current and valid and the visual representation of the transaction matches the expectations of all the parties involved in the transaction.
Once captured, the PDF document should be stored separately from the transaction data records, preferably in a system designed to store documents, such as a document or content management system. Once stored, the document can carry a transaction id or similar reference so that it can be connected to the transaction data records.
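As a rough sketch of this arrangement, the following Python code uses a plain directory as a stand-in for a document management system. The function names, the sidecar JSON file and the SHA-256 fingerprint are all assumptions made for illustration; the fingerprint is an optional extra check and not a substitute for the digital signature discussed later.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def archive_document(store_dir: Path, transaction_id: str, pdf_bytes: bytes) -> dict:
    """Write the PDF to the store and record a reference to the transaction."""
    digest = hashlib.sha256(pdf_bytes).hexdigest()
    pdf_path = store_dir / f"{transaction_id}.pdf"
    pdf_path.write_bytes(pdf_bytes)
    entry = {"transaction_id": transaction_id, "file": pdf_path.name, "sha256": digest}
    # Sidecar metadata links the document back to the database records.
    (store_dir / f"{transaction_id}.json").write_text(json.dumps(entry))
    return entry


def load_document(store_dir: Path, transaction_id: str) -> bytes:
    """Fetch the archived PDF by transaction id and check it is unchanged."""
    entry = json.loads((store_dir / f"{transaction_id}.json").read_text())
    pdf_bytes = (store_dir / entry["file"]).read_bytes()
    if hashlib.sha256(pdf_bytes).hexdigest() != entry["sha256"]:
        raise ValueError(f"archived document for {transaction_id} was modified")
    return pdf_bytes


store = Path(tempfile.mkdtemp())
# Placeholder bytes, not a valid PDF; a real system would pass the generated file.
sample = b"%PDF-1.7 placeholder content for illustration"
entry = archive_document(store, "T-1001", sample)
restored = load_document(store, "T-1001")
```

Keeping the document outside the transactional database means that schema migrations and data fixes on the database side never touch the archived view.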
Some factors to consider when using this approach, and specifically when using the PDF format:
In foreground transactions, where one or more humans are actively involved, there will always be a human readable version of the transaction that is used throughout the transaction process. For instance, when a shopper is purchasing products online, they will see their cart with the items before checkout and they will see a confirmation screen after the transaction is committed. In such transactions, the confirmation screen (or equivalent) should be saved as the human readable version of the transaction.
In background (automated) transactions, the parties involved should agree on a specific visual representation when the automated processes are put in place, and then produce that view for every transaction at the time that each transaction is committed. Whenever the view changes, the parties involved need to approve the new view before it is put into place.
The PDF format has a related sub-format called PDF/A (standardized as ISO 19005) that is specifically intended for long term archival of PDF documents.
PDF/A compliant PDF files are still valid PDF files but have additional requirements to make sure that the content can be rendered correctly at any time in the future, even on different systems. These requirements include embedding all fonts in the PDF document, strict definition of the colors used in the document and others to remove all dependencies on the environment that the document may be opened in.
Storing transactions using the PDF/A format would ensure exact reproducibility anytime in the foreseeable future.
In addition to using the PDF/A sub-format to store the transaction view, the PDF documents should also include a digital signature that includes a timestamp from a certified timestamp server.
The purpose of a digital signature in this context is not so much to positively identify the signer of the document, but rather to ensure that there are no modifications done to the document after its creation. Digital signatures in PDF documents can include a timestamp that will certify the time and date that the digital signature was applied.
By applying a digital signature at the time that the document is created, any changes made to the document henceforth would invalidate the signature, thereby protecting the original document against any modifications.
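The tamper-evidence property can be sketched in a few lines of Python. This is only an illustration of the principle: real PDF digital signatures are embedded in the document and rely on public-key certificates, whereas this sketch swaps in an HMAC with a shared secret from the standard library. The key and the document text are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret, for illustration only. Real PDF signatures use
# public-key certificates rather than a shared key.
SIGNING_KEY = b"demo-key-for-illustration-only"


def sign(document: bytes) -> str:
    """Compute a keyed fingerprint over the full document content."""
    return hmac.new(SIGNING_KEY, document, hashlib.sha256).hexdigest()


def verify(document: bytes, signature: str) -> bool:
    """Return True only if the document is byte-for-byte what was signed."""
    return hmac.compare_digest(sign(document), signature)


original = b"Invoice 1001: total due 65.00"
signature = sign(original)

# Simulate a later edit that swaps two digits in the amount.
tampered = original.replace(b"65.00", b"56.00")

original_ok = verify(original, signature)
tampered_ok = verify(tampered, signature)
print("original valid:", original_ok)
print("tampered valid:", tampered_ok)
```

Any change to the signed bytes, however small, makes verification fail, which is the behavior the archived transaction document relies on.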
The embedded digital signature can and should also include a timestamp from a certified timestamp server. The timestamp serves two purposes:
Today, PDF documents are widely used for statements and legal contracts. We suggest expanding the use of PDF documents to keep electronic receipts of all important transactions. Simply storing scattered data in a database is an unreliable solution for long term archiving. Visual documents that are locked and approved by all parties can provide safe, immutable records.