Want to make your PDFs 20% smaller for free?

Guust Ysebie


After 30 years of Deflate, PDFs are finally upgrading. Brotli will soon enter the PDF spec, delivering 15–25% smaller files with zero quality loss, using web-proven compression.

January 20, 2026
Brotli Compression in iText


For nearly three decades (since November 1996, to be exact), PDFs have relied on Deflate—the same compression algorithm that powers your ZIP files. Meanwhile, the web moved on. In 2015, Google introduced Brotli, a compression algorithm so efficient that it's now supported by over 95% of web browsers. Websites got faster. Downloads got smaller. CDNs got cheaper.

Now PDFs are getting the same upgrade.

The PDF Association is bringing this battle-tested web compression technology into the PDF specification itself. After a decade of proving its worth across billions of web requests daily, Brotli is now making its introduction into ISO 32000.

With iText, we can help drive widespread adoption with a production-ready Brotli encoder and decoder for the PDF ecosystem. The result? 15-25% smaller files with zero quality loss, using the same algorithm trusted by Google, Cloudflare, and every major CDN.

Why PDF compression has struggled to evolve

PDF compression has been stuck in 1996 for a good reason: backward compatibility is sacred. The PDF Association operates under a strict principle—any new feature must work seamlessly with existing readers, or it risks fragmenting the ecosystem. Adding a new compression algorithm isn't just a technical change; it's a breaking change that could render documents unreadable in older software. This creates a high barrier for innovation.

Beyond compatibility concerns, there are other practical challenges. The PDF specification moves slowly by design—it's an ISO standard that requires consensus among hundreds of stakeholders. Compression algorithms must be royalty-free (ruling out patented options), widely supported across platforms, and battle-tested in production.

Finally, the ecosystem is conservative: enterprises and governments rely on PDFs for archival and legal documents that must remain accessible for decades, making any breaking change a risk that needs extraordinary justification.

Encoding and decoding: Technical implementation

To get Brotli compression working within the iText SDK, we need to solve two problems: reading documents and writing them.

Let's start with the easier one: reading documents.

Decoding: Advanced plumbing work

First of all, let's look at how the content of a page is stored within a PDF. We can demonstrate this with just the classic "Hello World" text example.

The following PDF syntax simply displays the text "Hello World!" on a page:

5 0 obj                 % Unique identifier to reference this content from other places within the PDF
<</Length 49>>stream    % Metadata for the stream object. Here it contains a Length value indicating how many bytes follow the `stream` keyword.
q                       % the actual content
BT
/F1 12 Tf
37 788.33 Td
(Hello World!)Tj
ET
Q
endstream               % Indicates the end of the stream object
endobj                  % Indicates the end of the referenceable object
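Mechanically, a processor slices the payload out using the /Length value. Here's a simplified, stdlib-only sketch of that step (it ignores indirect /Length references, binary data, and the CR/LF handling a real parser must deal with):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified sketch: extract a stream's payload using its /Length entry.
// Real parsers must also handle binary data and indirect /Length references.
public class StreamExtractor {
    public static String extract(String object) {
        Matcher m = Pattern.compile("/Length (\\d+)>>stream\n").matcher(object);
        if (!m.find()) {
            throw new IllegalArgumentException("no stream found");
        }
        int length = Integer.parseInt(m.group(1));
        int start = m.end();                       // first byte after the stream keyword
        return object.substring(start, start + length);
    }

    public static void main(String[] args) {
        String content = "q\nBT\n/F1 12 Tf\n37 788.33 Td\n(Hello World!)Tj\nET\nQ\n";
        String object = "5 0 obj\n<</Length " + content.length() + ">>stream\n"
                + content + "endstream\nendobj";
        System.out.println(extract(object));
    }
}
```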

So, if we need to render or do anything else with the content, the process looks like this:

|------------------------|
| Get stream based on id |
|------------------------|
           ||
           \/
|------------------------|
|      Read content      |
|------------------------|
           ||
           \/
|------------------------|
|    Render/Do stuff     |
|    with the content    |
|------------------------|

Okay, now that we have a high-level view of how PDF processors handle the low-level processing of these stream objects, we can dive a little deeper!

Let's take a look at the following PDF stream object where the content is encoded using the Deflate algorithm.

5 0 obj
<</Filter/FlateDecode/Length 36>>stream                  % The metadata now includes a `Filter` entry
xœmÍÂ0„ïûëM/1?Æl®‚âUømI)Íûºm¢...            % Reduced for clarity
endstream
endobj

First of all, we notice an additional key, Filter, with the value FlateDecode in the metadata.
This can be interpreted as: "The content of this stream object is only usable after its FlateDecode filter is applied."
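That FlateDecode step maps directly onto java.util.zip, since PDF's Flate filter is the zlib/deflate format. A minimal round trip, assuming ASCII content for readability:

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Round trip: compress a content stream with zlib/deflate, then apply
// the FlateDecode step to recover the original bytes.
public class FlateRoundTrip {
    static byte[] flateDecode(byte[] compressed) throws Exception {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!inflater.finished()) {
            int n = inflater.inflate(buffer);
            if (n == 0 && inflater.needsInput()) break; // truncated input
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] content = "q BT /F1 12 Tf 37 788.33 Td (Hello World!) Tj ET Q".getBytes("US-ASCII");
        Deflater deflater = new Deflater();
        deflater.setInput(content);
        deflater.finish();
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            compressed.write(buffer, 0, deflater.deflate(buffer));
        }
        byte[] decoded = flateDecode(compressed.toByteArray());
        System.out.println(new String(decoded, "US-ASCII"));
    }
}
```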

So how does this change our working implementation?

|------------------------|
| Get stream based on id |
|------------------------|
           ||
           \/
|------------------------|
|      Read content      |
|------------------------|
           ||
           \/
|------------------------|
|         Decode         |
|     based on Filter    |
|------------------------|
           ||
           \/
|------------------------|
|    Render/Do stuff     |
|    with the content    |
|------------------------|

We can now see we require an operation on the content before it's usable. The PDF specification already provides a variety of filters for encoding the content of PDF streams:

Filter name      Description
ASCIIHexDecode   Decodes ASCII hexadecimal data to binary.
ASCII85Decode    Decodes ASCII base-85 data to binary.
LZWDecode        Decompresses data using LZW compression.
FlateDecode      Decompresses data using zlib/deflate compression.
RunLengthDecode  Decompresses data using run-length encoding.
CCITTFaxDecode   Decompresses CCITT fax-encoded monochrome images.
JBIG2Decode      Decompresses JBIG2-encoded monochrome image data.
DCTDecode        Decompresses JPEG DCT-based image data.
JPXDecode        Decompresses JPEG 2000 wavelet-based image data.
Crypt            Decrypts data encrypted by a security handler.
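Some of these filters are tiny. ASCIIHexDecode, for instance, fits in a few lines: pairs of hex digits become bytes, '>' marks end-of-data, and whitespace is ignored. A sketch:

```java
import java.io.ByteArrayOutputStream;

// Sketch of ASCIIHexDecode: pairs of hex digits become bytes; '>' marks
// end-of-data; whitespace is skipped, per the PDF filter's rules.
public class AsciiHexDecode {
    public static byte[] decode(String hex) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int hi = -1;
        for (char c : hex.toCharArray()) {
            if (c == '>') break;                   // end-of-data marker
            if (Character.isWhitespace(c)) continue;
            int digit = Character.digit(c, 16);
            if (hi < 0) {
                hi = digit;
            } else {
                out.write((hi << 4) | digit);
                hi = -1;
            }
        }
        if (hi >= 0) out.write(hi << 4);           // odd final digit: low nibble is zero
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(new String(decode("48 65 6C 6C 6F>")));
    }
}
```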

So the idea for Brotli is to simply add another Filter implementation. What we need to get it working in iText is actually pretty minimal:

  1. Get the decoding implementation from Google's repository.
  2. Write some plumbing code to call it from iText.
  3. Hook up the plumbing code to the BrotliDecode filter.

For the first step we simply embedded Google's reference Java Brotli decoder straight from their official repository into our kernel module.

Why embed the decoder?

By embedding Google's reference implementation directly, we guarantee:

  • Zero dependency hell: No version conflicts with other libraries
  • Consistent behavior: Same decoder on all platforms
  • Long-term stability: We control the code, even if upstream changes
  • Automatic C# version: using our porting mechanism, we can generate a matching C# implementation

The plumbing implementation lives in BrotliFilter.java, which plugs into iText's existing filter pipeline:

public class BrotliFilter extends MemoryLimitsAwareFilter {
    @Override
    public byte[] decode(byte[] b, PdfName filterName, PdfObject decodeParams,
            PdfDictionary streamDictionary) {
        try {
            final byte[] buffer = new byte[DEFAULT_BUFFER_SIZE];
            final ByteArrayInputStream input = new ByteArrayInputStream(b);
            final ByteArrayOutputStream output = enableMemoryLimitsAwareHandler(streamDictionary);
            final BrotliInputStream brotliInput = new BrotliInputStream(input);
            int len;
            while ((len = brotliInput.read(buffer, 0, buffer.length)) > 0) {
                output.write(buffer, 0, len);
            }
            brotliInput.close();
            return output.toByteArray();
        } catch (IOException e) {
            throw new PdfException(KernelExceptionMessageConstant.FAILED_TO_DECODE_BROTLI_STREAM, e);
        }
    }
}

Let's break down what's happening in this implementation:

  • Memory Safety First: The filter extends MemoryLimitsAwareFilter, which protects against decompression
    bombs—malicious PDFs that expand into gigabytes of data when decompressed. This is critical for production systems.
  • Wrapped Input Stream: The compressed bytes b are wrapped in a ByteArrayInputStream, which is then passed to
    Google's BrotliInputStream. This is where the magic happens—BrotliInputStream handles all the heavy lifting of Brotli decompression.
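The memory-limit idea can be illustrated with stdlib Deflate (iText's actual MemoryLimitsAwareFilter is more elaborate): cap the total decompressed size and abort as soon as it's exceeded, so a kilobyte-sized bomb can't balloon into gigabytes.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Sketch of decompression-bomb protection: stop inflating once the output
// exceeds a hard cap instead of letting a tiny input expand unbounded.
public class BombGuard {
    static byte[] inflateWithLimit(byte[] compressed, int maxBytes) throws Exception {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!inflater.finished()) {
            int n = inflater.inflate(buffer);
            if (n == 0 && inflater.needsInput()) break;
            if (out.size() + n > maxBytes) {
                throw new IllegalStateException("decompression limit exceeded");
            }
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] bomb = new byte[10_000_000];        // 10 MB of zeros compresses to a few KB
        Deflater deflater = new Deflater();
        deflater.setInput(bomb);
        deflater.finish();
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            compressed.write(buffer, 0, deflater.deflate(buffer));
        }
        System.out.println("compressed size: " + compressed.size() + " bytes");
        try {
            inflateWithLimit(compressed.toByteArray(), 1_000_000); // 1 MB cap
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```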

As you can see, writing the plumbing code is pretty easy because of iText's architecture.

The last thing to do is to ensure iText knows which implementation to associate with the /BrotliDecode filter.

This is also pretty trivial. The filter is registered automatically in FilterHandlers.java alongside /FlateDecode and the other standard PDF
filters:

public final class FilterHandlers {
    private static final Map<PdfName, IFilterHandler> defaults;

    static {
        Map<PdfName, IFilterHandler> map = new HashMap<>();

        map.put(PdfName.FlateDecode, new FlateDecodeFilter());
        map.put(PdfName.Fl, new FlateDecodeFilter());
        //other implementations removed for clarity

        // we add our implementation
        map.put(PdfName.BrotliDecode, new BrotliFilter());

        defaults = Collections.unmodifiableMap(map);
    }
}

That's it. From this point on, any PDF with /BrotliDecode streams just works. No configuration needed.
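With the registration above, decoding is just a map lookup. A simplified sketch of that dispatch, using stand-in stdlib types instead of iText's PdfName and IFilterHandler (the identity "decoders" here are placeholders for the real filter implementations):

```java
import java.util.Map;
import java.util.function.UnaryOperator;

// Simplified dispatch sketch: the /Filter name selects the decode function.
// Stand-in types; iText keys the map by PdfName and stores IFilterHandler values.
public class FilterDispatch {
    static final Map<String, UnaryOperator<byte[]>> HANDLERS = Map.of(
            "FlateDecode", bytes -> bytes,          // stand-in for real inflate
            "BrotliDecode", bytes -> bytes          // stand-in for real Brotli decode
    );

    static byte[] decode(String filterName, byte[] raw) {
        UnaryOperator<byte[]> handler = HANDLERS.get(filterName);
        if (handler == null) {
            throw new IllegalArgumentException("unsupported filter: " + filterName);
        }
        return handler.apply(raw);
    }

    public static void main(String[] args) {
        System.out.println(decode("BrotliDecode", "payload".getBytes()).length);
        try {
            decode("MysteryDecode", new byte[0]);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```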

Now we could have stopped here—our SDK could process Brotli-compressed PDFs from other sources. But reading isn't enough. To truly bring Brotli to the PDF ecosystem, we needed to let developers create these smaller files. That meant solving the encoding problem.

And encoding turned out to be significantly more complex than decoding.

Encoding: a separate module for compression

The problem: iText’s compression was hardcoded

Before Brotli, iText only supported two compression modes for PDF streams:

  1. Flate compression
  2. No compression

This logic was baked directly into the stream-writing code—there was no abstraction, no plugin point. If you wanted to use a different compression algorithm, you were out of luck.

To support Brotli (and future algorithms), we needed to introduce a new abstraction layer: IStreamCompressionStrategy.

public interface IStreamCompressionStrategy {
   /**
    * Gets the PDF filter name that identifies this compression algorithm.
    *
    * @return the PDF name representing the compression filter
    */
   PdfName getFilterName();

   /**
    * Gets the decode parameters required for decompressing the stream.
    * <p>
    * Decode parameters provide additional information needed to correctly
    * decompress the stream data.
    *
    * @return the decode parameters as a PDF object, or {@code null} if not needed
    */
   PdfObject getDecodeParams();

   /**
    * Creates a new output stream that wraps the original stream and applies compression.
    * @param original the original output stream to wrap
    * @param stream the PDF stream being compressed (may be used for context or configuration)
    *
    * @return a new output stream that performs compression
    */
    OutputStream createNewOutputStream(OutputStream original, PdfStream stream);
}

This interface decouples compression logic from iText's core PDF writing machinery. Now, instead of hardcoding Flate everywhere, we can inject different strategies at runtime. To inject the required strategy we make use of the DiContainer. You can find more information about it here: Adding Dependency Injection to the PdfDocument class.

From now on when iText needs to compress a stream, it asks the DiContainer in the PdfDocument: "Do you have an IStreamCompressionStrategy?"

  • If yes: Use the registered strategy (Brotli in this case)
  • If no: Fall back to the default Flate compression

This design gives us:

  • Zero coupling: iText Core no longer cares about the algorithm used
  • Opt-in behavior: You only pay the cost if you use it
  • Future-proof: New algorithms just implement the interface
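The shape of this abstraction can be demonstrated with stdlib types alone. The stand-in interface below mirrors the real one (minus the iText-specific PdfName and PdfStream parameters); swapping the strategy swaps the algorithm without touching the writer:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.DeflaterOutputStream;

// Stand-in for IStreamCompressionStrategy: the writer asks the strategy to
// wrap its output stream; swapping the strategy swaps the algorithm.
interface StreamCompressionStrategy {
    String filterName();
    OutputStream wrap(OutputStream original) throws IOException;
}

class FlateStrategy implements StreamCompressionStrategy {
    public String filterName() { return "FlateDecode"; }
    public OutputStream wrap(OutputStream original) {
        return new DeflaterOutputStream(original);  // compress bytes on the way out
    }
}

public class StrategyDemo {
    public static void main(String[] args) throws IOException {
        StreamCompressionStrategy strategy = new FlateStrategy();
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = strategy.wrap(sink)) {
            out.write("q BT /F1 12 Tf (Hello World!) Tj ET Q".getBytes("US-ASCII"));
        }
        System.out.println("/Filter /" + strategy.filterName());
        System.out.println("bytes written: " + (sink.size() > 0));
    }
}
```

A Brotli strategy is the same shape: return "BrotliDecode" from filterName() and wrap the stream in a Brotli encoder instead.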

The second problem: no pure Java encoder

Here's where things got tricky. While Google's Brotli decoder has a pure Java implementation (which we embedded for reading), the official Brotli encoder is C++ only. To use it from Java, you need:

  • JNI bindings to call native code from Java
  • Platform-specific native libraries (.dll for Windows, .so for Linux, .dylib for macOS)
  • Build infrastructure to compile and ship these libraries for every platform

For a heavily used library like iText, shipping native binaries is a non-starter:

  • Deployment complexity: Users need to manage native libraries across platforms
  • Security concerns: Native code introduces attack surfaces
  • Build maintenance: We'd need to compile for Windows x64, Linux ARM, macOS Silicon, etc.
  • Version conflicts: What if another library ships a different Brotli version?

We needed a solution that handled this complexity outside iText's core.

That solution is a separate Maven module (brotli-compressor) that you add as an optional dependency. This
module contains:

  • BrotliStreamCompressionStrategy: Implementation of IStreamCompressionStrategy
  • brotli4j dependency: A third-party library that wraps Google's C++ encoder with JNI

Here's what BrotliStreamCompressionStrategy looks like:

public class BrotliStreamCompressionStrategy implements IStreamCompressionStrategy {

    @Override
    public OutputStream createNewOutputStream(OutputStream original, PdfStream stream) {
        int compressionLevel = convertCompressionLevel(stream.getCompressionLevel());
        Encoder.Parameters params = Encoder.Parameters.create(compressionLevel);
        try {
            return new BrotliOutputStream(original, params);
        } catch (IOException e) {
            throw new PdfException(KernelExceptionMessageConstant.CANNOT_WRITE_TO_PDF_STREAM, e);
        }
    }

    @Override
    public PdfName getFilterName() {
        return PdfName.BrotliDecode; // This goes into the /Filter entry
    }
}

The native wrapper: brotli4j

Instead of writing JNI bindings ourselves, we rely on brotli4j—a mature, well-tested library that:

  • Wraps Google's official C++ Brotli encoder/decoder
  • Ships pre-compiled native libraries for all major platforms (Windows x64/ARM, Linux x64/ARM, macOS Intel/Silicon)
  • Automatically extracts the correct native library at runtime (no manual setup)
  • Is actively maintained and widely used (powers projects like Netty, OkHttp)

By delegating to brotli4j, we get production-grade native bindings without maintaining our own JNI layer.

Why keep encoding separate?

You might ask: "Why not bundle brotli4j in the kernel module like you did with the decoder?"

Great question. Here's the reasoning:

Aspect           Decoder (in kernel)             Encoder (separate module)
Necessity        Required to read Brotli PDFs    Optional—only for writing
Dependencies     Pure Java (Google's decoder)    Native code (brotli4j with JNI)
Size impact      ~300KB of Java code             ~2MB of native libraries
Use frequency    Every user needs to read PDFs   Most users stick with Flate
Backward compat  No breaking changes             Opt-in feature

By keeping the encoder separate, we give users choice: add brotli-compressor if you need 20% smaller files, or stick with the default if native dependencies are a concern.

Putting it all together: Full example

Here's what it looks like to create a Brotli-compressed PDF:

First, add the required dependencies. Note that you have to add iText's own repository: because of the experimental nature of the code, the module isn't published to Maven Central, so users can't enable it by accident.

<repositories>
  <repository>
    <id>itext-releases</id>
    <name>iText Repository - releases</name>
    <url>https://repo.itextsupport.com/releases</url>
  </repository>
</repositories>

<dependency>
  <groupId>com.itextpdf</groupId>
  <artifactId>brotli-compressor</artifactId>
  <version>{itext.version.bigger.then.9.5.0}</version>
</dependency>
public static void main(String[] args) {
    // 1. Register the compression strategy
    DocumentProperties properties = new DocumentProperties();
    properties.registerDependency(IStreamCompressionStrategy.class,
            new BrotliStreamCompressionStrategy());

    // 2. Create your PDF as normal
    PdfWriter writer = new PdfWriter("output.pdf");
    PdfDocument pdf = new PdfDocument(writer, properties);

    // Everything from here on uses Brotli automatically
    Document doc = new Document(pdf);
    doc.add(new Paragraph("This text will be Brotli-compressed!"));
    doc.add(new Image(ImageDataFactory.create("chart.png")));
    doc.close();
}

When you open output.pdf in a text editor, you'll see some entries looking like this:

5 0 obj
<</Filter/BrotliDecode/Length 847>>stream
[binary Brotli-compressed data]
endstream
endobj

The PDF now uses /BrotliDecode instead of /FlateDecode, and the file is 15-25% smaller—with zero changes to your document-building code.

The catch: Compatibility isn't universal (yet)

Here's the honest truth: Brotli-compressed PDFs won't open in Adobe Acrobat Reader today. They won't render in your browser's built-in PDF viewer. Most third-party PDF libraries will reject them outright.

Why? Because /BrotliDecode isn't part of the official PDF specification yet. The PDF Association is actively working on adding it to ISO 32000 (the PDF standard), but until that's finalized and implementations roll out, Brotli PDFs exist in a gray area.

What about forward compatibility?

Here's the good news: Brotli PDFs are future-proof. Once the PDF Association finalizes the spec and vendors implement it, your existing Brotli-compressed documents will just work. You're not creating broken files—you're creating files that are ahead of their time.

Think of it like HTTP/2 in 2015. Early adopters who deployed it got immediate performance wins in their own
infrastructure, and as browsers caught up, those benefits became universal. Brotli PDFs follow the same pattern.

iText's commitment

We're not shipping this as a toy feature. We're working directly with the PDF Association to:

  • Standardize the specification (syntax, decode parameters, dictionary support)
  • Validate implementations across multiple platforms (Java, .NET, C++)
  • Contribute test suites to ensure interoperability when other vendors adopt it
  • Support migrations when the spec finalizes (we'll handle any breaking changes)

By adopting Brotli compression now, you're not taking a risk—you're investing in a proven technology that's on a clear path to standardization.

Conclusion

PDF compression hasn't evolved in 30 years—until now. Brotli represents the biggest leap in PDF storage efficiency since the format was invented, and iText is bringing it to production today.

Yes, there are compatibility limitations. Yes, it's experimental. But every standard starts this way. HTTP/2, WebP, and TLS 1.3 were all "experimental" once. Early adopters got the benefits first, then the ecosystem caught up.

By using iText's Brotli implementation now, you're:

  • Reducing storage costs by 15-25% immediately
  • Future-proofing your documents for inevitable standardization
  • Helping shape the spec with real-world feedback
  • Voting with code for a more efficient PDF ecosystem

The PDF Association is listening. Adobe is watching. And iText is leading.

Let's make PDFs smaller together. 🚀

