How to optimally prepare PDF files for variable data printing
Dietrich von Seggern from callas software explains how companies can optimally prepare PDF files for variable data printing.
Under continuously increasing competitive pressure, companies are looking for ways to address their customers and prospects in a more targeted manner. In commercial printing, too, the ‘Giesskannenprinzip’ or ‘watering-can principle’ of standardized advertising campaigns has had its day and is often replaced by individualized communication via variable data printing (VDP). This requires defined digital processes that combine information to produce individual print results. Among other things, this approach has implications for PDF creation and PDF workflows in prepress. Powerful applications and integrated marketing tools that merge variable or individual data into a design are already available, and their output is usually a PDF file. However, these files must then also be output on high-speed presses in an acceptable time, which is not always straightforward.
Let's talk VDP with use cases
In principle, individualized print communication is achieved by combining individual data, associated with the addressee of the communication, with designed templates. One example is shopping cart abandoners: people who add products to their shopping cart in an online store but do not complete the order. To persuade them to buy the selected products, retailers can send them a postcard, for example, which should of course be addressed to them personally and perhaps include a discount code. Other applications include individualized labels or packaging, direct mailings and tickets, where the combination of variable data and designed templates almost always adds up to several thousand pages.
VDP - behind the scenes
As a rule, such print products are produced on powerful presses that are now very fast. The individual data can be merged with the template directly upstream of the press, in its Raster Image Processor (RIP) or in the Digital Front End (DFE) of the print service provider. In this case, there must be a close relationship between the customer and the print service provider to ensure that the latter can process the data. In today's increasingly short-term, purely service-based relationships, however, customers must create the individualized PDF pages themselves. The print service provider then receives a PDF file containing many thousands of pages. Such PDF files must meet certain requirements so that the DFE can process them at high speed.
The issue
Since the print service provider does not create the PDF itself, it has no control over the quality of the file or, above all, over whether the file can be processed quickly. If the press has to wait for the data to be prepared, the job takes longer. With many individualized pages, the delays quickly add up, so that the agreed price for the job may no longer be economical and subsequent jobs may even be delayed or dropped.
The solution
To avoid this, the PDF Association has published a guide with practical recommendations for variable data printing. It addresses creators of VDP files and software providers whose tools are used to create PDF files that are "VDP-ready". It is important to note in this context that the recommendations do not impose any restrictions on the design of the print products. The sole aim is to achieve the desired design in such a way that high-performance print output is possible.
Caching enables fast processing of VDP jobs
A small calculation illustrates why optimizing VDP jobs matters. Suppose a postcard is to be printed 360 times on a press that produces three pages per second, which corresponds to 180 pages per minute. If the postcard is static, the same page is printed 360 times and the DFE has two minutes before it must deliver the next page, which is not a problem. However, if the postcard contains variable data, 360 different pages must be produced. The DFE then has only one-third of a second to prepare each page, which is generally too little for complex pages.
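The arithmetic above can be reproduced in a few lines. This is only an illustrative sketch; the function names are made up for this example, and the figures are the ones from the text.

```python
def job_duration_seconds(page_count: int, pages_per_second: float) -> float:
    """Time the press needs to print the whole job."""
    return page_count / pages_per_second


def per_page_budget_seconds(pages_per_second: float) -> float:
    """Time available to prepare each page when every page is different."""
    return 1.0 / pages_per_second


PAGES = 360   # number of postcards in the example
SPEED = 3.0   # press speed in pages per second

pages_per_minute = SPEED * 60
static_budget = job_duration_seconds(PAGES, SPEED)   # one page, printed 360 times
variable_budget = per_page_budget_seconds(SPEED)     # 360 different pages

print(f"Press speed: {pages_per_minute:.0f} pages per minute")
print(f"Static job: {static_budget:.0f} s until the next page is needed")
print(f"Variable job: {variable_budget:.3f} s to prepare each page")
```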
So does this mean that complex designs with variable data cannot be printed economically? Fortunately, the answer is no, because VDP data has a special property: it generally consists of static and variable objects. Specifically, the templates are based to a large extent on static content that is identical on every page, for example the background of a flyer. The variable, or changing, content can be individual text, barcodes, QR codes or images. The figure below, for example, shows a brochure page with a static background and individual images and texts.
Static and Variable objects
If the DFE is able to distinguish between variable and static data, it can cache the static objects and reuse them for the following pages. The DFE then only has to process the variable data for each page, which is much faster. The requirement is therefore to prepare PDF files in such a way that the DFE can distinguish between variable and static content and cache the latter. Repeating objects are stored within PDF files as so-called XObjects. However, not every XObject is used more than once in a PDF. To decide whether an XObject should be cached, the DFE must also detect whether the XObject depends on general parameters, the so-called graphics state parameters, in which case it does not always look identical. Caching only makes sense if there are no such dependencies. Detecting this reliably, however, is a challenge.
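The effect of caching can be illustrated with a toy model. The following sketch does not describe how any real DFE works; the `PageObject` class, the `cacheable` flag and the render costs in milliseconds are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageObject:
    object_id: str      # hypothetical identifier of an XObject or other content
    cacheable: bool     # assumed: static and independent of the graphics state
    render_cost: float  # assumed processing cost in milliseconds


def total_render_cost(pages, use_cache: bool) -> float:
    """Sum the processing cost of all pages, rendering cacheable objects only once."""
    cached_ids = set()
    total = 0.0
    for page in pages:
        for obj in page:
            if use_cache and obj.cacheable:
                if obj.object_id in cached_ids:
                    continue  # already rendered, reuse the cached result
                cached_ids.add(obj.object_id)
            total += obj.render_cost
    return total


# 360 postcards: one expensive static background plus a cheap variable address.
background = PageObject("background", cacheable=True, render_cost=300.0)
pages = [
    [background, PageObject(f"address-{i}", cacheable=False, render_cost=20.0)]
    for i in range(360)
]

uncached = total_render_cost(pages, use_cache=False)
cached = total_render_cost(pages, use_cache=True)
print(f"Without caching: {uncached:.0f} ms, with caching: {cached:.0f} ms")
```

With these assumed costs, the background is processed once instead of 360 times, so almost all of the remaining work is the variable content.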
Ideally, the XObject would tell the DFE that it is independent and used multiple times. In principle, this is possible via an entry in the XObject metadata defined in PDF/VT, the ISO standard for variable data and transactional printing. Unfortunately, this metadata is hardly ever used in practice. Another tool defined in PDF/VT is DPart metadata, which can indicate the number of pages per copy. If this indicates, for example, that a brochure consists of four pages, the DFE can compare the corresponding pages and look for XObjects that repeat on the first, the fifth, the ninth page and so on. The DFE then knows which pages might be largely identical, which makes the caching decision easier.
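The comparison of corresponding pages can be sketched as follows. The sketch assumes that the XObject identifiers used on each page and the number of pages per copy are already known; the function name and the sample identifiers are hypothetical, and reading real DPart metadata is not shown.

```python
def cache_candidates(page_xobjects, pages_per_record):
    """For each page position within a copy, return the XObject ids that
    appear on that position in every copy (for example pages 1, 5, 9, ...)."""
    if pages_per_record < 1:
        raise ValueError("pages_per_record must be at least 1")
    candidates = []
    for position in range(pages_per_record):
        same_position = page_xobjects[position::pages_per_record]
        if len(same_position) < 2:
            candidates.append(set())  # nothing to compare against
        else:
            candidates.append(set.intersection(*(set(p) for p in same_position)))
    return candidates


# Two copies of a four-page brochure, with hypothetical XObject ids per page.
pages = [
    {"cover-bg", "name-1"}, {"inner-bg"}, {"inner-bg", "offer-1"}, {"back-bg"},
    {"cover-bg", "name-2"}, {"inner-bg"}, {"inner-bg", "offer-2"}, {"back-bg"},
]
result = cache_candidates(pages, pages_per_record=4)
for position, ids in enumerate(result, start=1):
    print(f"Page {position} of each copy: {sorted(ids)}")
```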
These challenges lead to a number of recommendations for the layout of VDP data, which the PDF Association guidelines state as follows:
- Place static content before variable content
- Avoid object overlays between static and variable content
- Do not place hidden objects
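The first two recommendations could be checked mechanically along the following lines. This is a simplified sketch, not a tool described in the guidelines: it assumes each page is available as a list of objects in paint order, each marked as static or variable and given a bounding box, and the `Placed` class and all names are hypothetical. A real check would also have to consider clipping, transparency and the actual object shapes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placed:
    name: str
    is_static: bool
    bbox: tuple  # (x0, y0, x1, y1), hypothetical page coordinates


def boxes_overlap(a, b) -> bool:
    """True if two axis-aligned bounding boxes share some area."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


def check_page(objects):
    """Return warnings for one page; objects are given in paint order."""
    warnings = []
    seen_variable = False
    for obj in objects:
        if not obj.is_static:
            seen_variable = True
        elif seen_variable:
            warnings.append(f"static object '{obj.name}' is painted after variable content")
    for i, first in enumerate(objects):
        for second in objects[i + 1:]:
            if first.is_static != second.is_static and boxes_overlap(first.bbox, second.bbox):
                warnings.append(f"'{first.name}' overlaps '{second.name}'")
    return warnings


page = [
    Placed("banner", True, (0, 120, 100, 150)),
    Placed("name", False, (10, 125, 60, 135)),
    Placed("logo", True, (70, 10, 95, 35)),
]
page_warnings = check_page(page)
for warning in page_warnings:
    print(warning)
```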
Transparency and overprinting
Another complication when caching XObjects is transparency and overprinting. When a static XObject interacts with variable content on the same page through transparency or overprinting, it can no longer be cached in a meaningful way because it does not always look the same. Consequently, transparency on a variable object that interacts with static objects, or vice versa, is problematic. Ideally, if transparency is used in a VDP file, it should only be used on static objects, and any overlay with variable content should be avoided wherever possible.
Fonts and images in variable data printing
Font caching can also be useful, provided that a different subset is not used for each page. It is therefore recommended to use only one subset per font for the entire PDF file. Images that are used more than once should only be embedded once. Further recommendations are to keep the image resolution rather low and to remove masked pixels. As illustrated in figure two, in some cases there are also personalized images. In this case, it makes sense to divide them into a static and a variable part, keeping the personalized part as small as possible.
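The idea of embedding a repeated image only once can be sketched with a content hash. This is a generic illustration that works on raw bytes, not on actual PDF structures; the function name and the sample data are hypothetical.

```python
import hashlib


def deduplicate_images(placements):
    """placements: list of (placement_name, image_bytes).
    Returns the unique images by digest and a placement-to-digest mapping."""
    store = {}
    references = {}
    for name, data in placements:
        digest = hashlib.sha256(data).hexdigest()
        store.setdefault(digest, data)  # keep only the first copy of each image
        references[name] = digest
    return store, references


placements = [
    ("page1-logo", b"logo-image-bytes"),
    ("page2-logo", b"logo-image-bytes"),    # same image used again
    ("page2-photo", b"photo-image-bytes"),
]
store, references = deduplicate_images(placements)
print(f"{len(placements)} placements, {len(store)} embedded images")
```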
More information:
PDF Association's Best Practice in Creating Print Files for Variable Data Printing (VDP)