Want to make your PDFs 20% smaller for free?
January 20, 2026
For nearly three decades (since November 1996, to be exact), PDFs have relied on Deflate, the same compression algorithm that powers your ZIP files. Meanwhile, the web moved on. In 2015, Google introduced Brotli, a compression algorithm so efficient it now powers 95% of internet traffic. Websites got faster. Downloads got smaller. CDNs got cheaper.
Now PDFs are getting the same upgrade.
The PDF Association is bringing this battle-tested web compression technology into the PDF specification itself. After a decade of Brotli proving its worth across billions of web requests daily, it's now set to make its introduction into ISO 32000.
With iText, we can help drive widespread adoption by providing a production-ready Brotli encoder and decoder for the PDF ecosystem. The result? 15-25% smaller files with zero quality loss, using the same algorithm trusted by Google, Cloudflare, and every major CDN.
Why PDF compression has struggled to evolve
PDF compression has been stuck in 1996 for a good reason: backward compatibility is sacred. The PDF Association operates under a strict principle—any new feature must work seamlessly with existing readers, or it risks fragmenting the ecosystem. Adding a new compression algorithm isn't just a technical change; it's a breaking change that could render documents unreadable in older software. This creates a high barrier for innovation.
Beyond compatibility concerns, there are other practical challenges. The PDF specification moves slowly by design: it's an ISO standard that requires consensus among hundreds of stakeholders. Compression algorithms must be royalty-free (ruling out patented options), widely supported across platforms, and battle-tested in production.
Finally, the ecosystem is conservative: enterprises and governments rely on PDFs for archival and legal documents that must remain accessible for decades, making any breaking change a risk that needs extraordinary justification.
Encoding and decoding: Technical implementation
To get Brotli compression working within the iText SDK, we need to solve two problems: reading documents and writing them.
Let's start with the easier one: reading documents.
Decoding: Advanced plumbing work
First of all, let's look at how the content of a page is stored within a PDF. We can demonstrate this with just the classic "Hello World" text example.
The following PDF syntax simply displays the text "Hello World!" on a page:
5 0 obj          % Unique identifier to reference this content from other places within the PDF
<</Length 49>>   % Metadata for the stream object. Here it contains a Length value to indicate how many bytes there are after the `stream` keyword.
stream
q                % the actual content
BT
/F1 12 Tf
37 788.33 Td
(Hello World!)Tj
ET
Q
endstream        % Indicates the end of the stream object
endobj           % Indicates the end of the referenceable object
So, if we need to render or do anything else with the content, it would look like the following:
|------------------------|
| Get stream based on id |
|------------------------|
||
\/
|------------------------|
| Read content |
|------------------------|
||
\/
|------------------------|
| Render/Do stuff |
| with the content |
|------------------------|
Okay, now that we have a high-level view of how PDF processors handle the low-level processing of those stream objects, we can dive a little deeper!
Let's take a look at the following PDF stream object where the content is encoded using the Deflate algorithm.
5 0 obj
<</Filter/FlateDecode/Length 36>>   % The metadata now includes `Filter`
stream
xœmÍÂ0„ïûëM/1?Æl®‚âUømI)Íûºm¢...    % Reduced for clarity
endstream
endobj
First of all, we notice there is an additional key, Filter, with a value of FlateDecode in the metadata.
This can be interpreted the following way: "The content of this stream object is only usable after its FlateDecode filter is applied".
So how does this change our working implementation?
|------------------------|
| Get stream based on id |
|------------------------|
||
\/
|------------------------|
| Read content |
|------------------------|
||
\/
|------------------------|
| Decode |
| based on Filter |
|------------------------|
||
\/
|------------------------|
| Render/Do stuff |
| with the content |
|------------------------|
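Translating this diagram into iText calls gives us a minimal sketch like the one below. The file name and the object number 5 (taken from the earlier example) are assumptions for illustration; note how the decoded bytes have to be requested explicitly.
import java.io.IOException;

import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.pdf.PdfStream;

public class ReadStreamSketch {
    public static void main(String[] args) throws IOException {
        PdfDocument pdf = new PdfDocument(new PdfReader("hello.pdf"));

        // "Get stream based on id": look up object 5 directly
        PdfStream stream = (PdfStream) pdf.getPdfObject(5);

        // "Read content": the raw bytes, still Deflate-compressed
        byte[] raw = stream.getBytes(false);

        // "Decode based on Filter": iText applies the declared filter chain
        byte[] decoded = stream.getBytes(true);

        System.out.println(raw.length + " raw bytes, " + decoded.length + " decoded bytes");
        pdf.close();
    }
}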
We can now see that an operation is required on the content before it's usable. The PDF specification already provides a variety of filters for encoding the content of PDF streams:
| Filter name | Description |
| --- | --- |
| ASCIIHexDecode | Decodes ASCII hexadecimal data to binary. |
| ASCII85Decode | Decodes ASCII base-85 data to binary. |
| LZWDecode | Decompresses data using LZW compression. |
| FlateDecode | Decompresses data using zlib/deflate compression. |
| RunLengthDecode | Decompresses data using run-length encoding. |
| CCITTFaxDecode | Decompresses CCITT fax-encoded monochrome images. |
| JBIG2Decode | Decompresses JBIG2-encoded monochrome image data. |
| DCTDecode | Decompresses JPEG DCT-based image data. |
| JPXDecode | Decompresses JPEG 2000 wavelet-based image data. |
| Crypt | Decrypts data encrypted by a security handler. |
So the idea for Brotli is to simply add another Filter implementation. What we need to get it working in iText is actually pretty minimal:
- Get the decoding implementation from Google's repository.
- Write some plumbing code to call it from iText.
- Hook up the plumbing code to the BrotliDecode filter.
For the first step we simply embedded Google's reference Java Brotli decoder straight from their official repository into our kernel module.
Why embed the decoder?
By embedding Google's reference implementation directly, we guarantee:
- Zero dependency hell: No version conflicts with other libraries
- Consistent behavior: Same decoder on all platforms
- Long-term stability: We control the code, even if upstream changes
- Automatically generated C# version: using our porting mechanism, we can generate a C# implementation
The plumbing implementation lives in BrotliFilter.java, which plugs into iText's existing filter pipeline:
public class BrotliFilter extends MemoryLimitsAwareFilter {
    @Override
    public byte[] decode(byte[] b, PdfName filterName, PdfObject decodeParams,
            PdfDictionary streamDictionary) {
        try {
            final byte[] buffer = new byte[DEFAULT_BUFFER_SIZE];
            final ByteArrayInputStream input = new ByteArrayInputStream(b);
            final ByteArrayOutputStream output = enableMemoryLimitsAwareHandler(streamDictionary);
            final BrotliInputStream brotliInput = new BrotliInputStream(input);
            int len;
            while ((len = brotliInput.read(buffer, 0, buffer.length)) > 0) {
                output.write(buffer, 0, len);
            }
            brotliInput.close();
            return output.toByteArray();
        } catch (IOException e) {
            throw new PdfException(KernelExceptionMessageConstant.FAILED_TO_DECODE_BROTLI_STREAM, e);
        }
    }
}
Let's break down what's happening in this implementation:
- Memory Safety First: The filter extends MemoryLimitsAwareFilter, which protects against decompression bombs (malicious PDFs that expand into gigabytes of data when decompressed). This is critical for production systems.
- Wrapped Input Stream: The compressed bytes are wrapped in a ByteArrayInputStream, which is then passed to Google's BrotliInputStream. This is where the magic happens: BrotliInputStream handles all the heavy lifting of Brotli decompression.
As you can see, writing the plumbing code is pretty easy because of iText's architecture.
The last thing to do is to ensure iText knows which implementation to associate with the /BrotliDecode filter.
This is also pretty trivial. The filter is registered automatically in FilterHandlers.java alongside /FlateDecode and the other standard PDF filters:
public final class FilterHandlers {
    private static final Map<PdfName, IFilterHandler> defaults;

    static {
        Map<PdfName, IFilterHandler> map = new HashMap<>();
        map.put(PdfName.FlateDecode, new FlateDecodeFilter());
        map.put(PdfName.Fl, new FlateDecodeFilter());
        // other implementations removed for clarity
        // we add our implementation
        map.put(PdfName.BrotliDecode, new BrotliFilter());
        defaults = Collections.unmodifiableMap(map);
    }
}
That's it. From this point on, any PDF with /BrotliDecode streams just works. No configuration needed.
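To make that concrete, here is a minimal sketch of reading back a Brotli-compressed document. The file name is a placeholder; the point is that nothing Brotli-specific appears in the code.
import java.io.IOException;

import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfReader;

public class ReadBrotliPdfSketch {
    public static void main(String[] args) throws IOException {
        // Opening the document is identical to opening any other PDF
        PdfDocument pdf = new PdfDocument(new PdfReader("brotli-compressed.pdf"));

        // getContentBytes() runs the stream through the registered filter handlers,
        // so /BrotliDecode streams are decompressed transparently by BrotliFilter
        byte[] pageContent = pdf.getPage(1).getContentBytes();
        System.out.println(pageContent.length + " bytes of decoded page content");

        pdf.close();
    }
}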
Now we could have stopped here—our SDK could process Brotli-compressed PDFs from other sources. But reading isn't enough. To truly bring Brotli to the PDF ecosystem, we needed to let developers create these smaller files. That meant solving the encoding problem.
And encoding turned out to be significantly more complex than decoding.
Encoding: a separate module for compression
The problem: iText’s compression was hardcoded
Before Brotli, iText only supported two compression modes for PDF streams:
- Flate compression
- No compression
This logic was baked directly into the stream-writing code—there was no abstraction, no plugin point. If you wanted to use a different compression algorithm, you were out of luck.
To support Brotli (and future algorithms), we needed to introduce a new abstraction layer: IStreamCompressionStrategy.
public interface IStreamCompressionStrategy {
    /**
     * Gets the PDF filter name that identifies this compression algorithm.
     *
     * @return the PDF name representing the compression filter
     */
    PdfName getFilterName();

    /**
     * Gets the decode parameters required for decompressing the stream.
     * <p>
     * Decode parameters provide additional information needed to correctly
     * decompress the stream data.
     *
     * @return the decode parameters as a PDF object, or {@code null} if not needed
     */
    PdfObject getDecodeParams();

    /**
     * Creates a new output stream that wraps the original stream and applies compression.
     *
     * @param original the original output stream to wrap
     * @param stream the PDF stream being compressed (may be used for context or configuration)
     *
     * @return a new output stream that performs compression
     */
    OutputStream createNewOutputStream(OutputStream original, PdfStream stream);
}
This interface decouples compression logic from iText's core PDF writing machinery. Now, instead of hardcoding Flate everywhere, we can inject different strategies at runtime. To inject the required strategy we make use of the DiContainer. You can find more information about it here: Adding Dependency Injection to the PdfDocument class.
From now on when iText needs to compress a stream, it asks the DiContainer in the PdfDocument: "Do you have an IStreamCompressionStrategy?"
- If yes: Use the registered strategy (Brotli in this case)
- If no: Fall back to the default Flate compression
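As a rough sketch of that decision, assuming the stream-writing code receives the strategy resolved from the DiContainer; the names and structure below are illustrative, not iText's actual internals:
import java.io.OutputStream;
import java.util.zip.DeflaterOutputStream;

import com.itextpdf.kernel.pdf.PdfName;
import com.itextpdf.kernel.pdf.PdfStream;
// plus the IStreamCompressionStrategy interface shown above

final class CompressionDispatchSketch {
    static OutputStream openCompressedOutput(IStreamCompressionStrategy strategy,
            OutputStream rawOutput, PdfStream pdfStream) {
        if (strategy != null) {
            // A strategy was registered (e.g. BrotliStreamCompressionStrategy):
            // record its filter name and let it wrap the output stream
            pdfStream.put(PdfName.Filter, strategy.getFilterName());
            return strategy.createNewOutputStream(rawOutput, pdfStream);
        }
        // Nothing registered: fall back to the classic Flate behaviour
        pdfStream.put(PdfName.Filter, PdfName.FlateDecode);
        return new DeflaterOutputStream(rawOutput);
    }
}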
This design gives us:
- Zero coupling: iText Core no longer cares about the algorithm used
- Opt-in behavior: You only pay the cost if you use it
- Future-proof: New algorithms just implement the interface
The Second Problem: No Pure Java Encoder
Here's where things got tricky. While Google's Brotli decoder has a pure Java implementation (which we embedded for reading), the official Brotli encoder is C++ only. To use it from Java, you need:
- JNI bindings to call native code from Java
- Platform-specific native libraries (.dll for Windows, .so for Linux, .dylib for macOS)
- Build infrastructure to compile and ship these libraries for every platform
For a heavily-used library like iText, shipping native binaries is a non-starter:
- Deployment complexity: Users need to manage native libraries across platforms
- Security concerns: Native code introduces attack surfaces
- Build maintenance: We'd need to compile for Windows x64, Linux ARM, macOS Silicon, etc.
- Version conflicts: What if another library ships a different Brotli version?
We needed a solution that handled this complexity outside iText's core.
That solution is a separate Maven module (brotli-compressor) that you add as an optional dependency. This module contains:
- BrotliStreamCompressionStrategy: Implementation of IStreamCompressionStrategy
- brotli4j dependency: A third-party library that wraps Google's C++ encoder with JNI
Here's what BrotliStreamCompressionStrategy looks like:
public class BrotliStreamCompressionStrategy implements IStreamCompressionStrategy {
    @Override
    public OutputStream createNewOutputStream(OutputStream original, PdfStream stream) {
        int compressionLevel = convertCompressionLevel(stream.getCompressionLevel());
        Encoder.Parameters params = Encoder.Parameters.create(compressionLevel);
        try {
            return new BrotliOutputStream(original, params);
        } catch (IOException e) {
            throw new PdfException(KernelExceptionMessageConstant.CANNOT_WRITE_TO_PDF_STREAM, e);
        }
    }

    @Override
    public PdfName getFilterName() {
        return PdfName.BrotliDecode; // This goes into the /Filter entry
    }

    @Override
    public PdfObject getDecodeParams() {
        return null; // Brotli needs no additional decode parameters
    }
}
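The snippet above omits the convertCompressionLevel helper. As an assumption for illustration only (not necessarily the mapping iText ships), it could translate iText's Flate-style levels (-1 for default, then 0-9) onto Brotli's 0-11 quality scale like this:
// Hypothetical mapping, shown only to make the snippet above self-contained.
private static int convertCompressionLevel(int pdfCompressionLevel) {
    if (pdfCompressionLevel < 0) {
        // DEFAULT_COMPRESSION (-1): pick a balanced Brotli quality
        return 6;
    }
    // Scale the 0-9 Flate range onto Brotli's 0-11 quality range
    return Math.min(11, (int) Math.round(pdfCompressionLevel * 11.0 / 9.0));
}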
The native wrapper: brotli4j
Instead of writing JNI bindings ourselves, we rely on brotli4j—a mature, well-tested library that:
- Wraps Google's official C++ Brotli encoder/decoder
- Ships pre-compiled native libraries for all major platforms (Windows x64/ARM, Linux x64/ARM, macOS Intel/Silicon)
- Automatically extracts the correct native library at runtime (no manual setup)
- Is actively maintained and widely used (powers projects like Netty, OkHttp)
By delegating to brotli4j, we get production-grade native bindings without maintaining our own JNI layer.
Why keep encoding separate?
You might ask: "Why not bundle brotli4j in the kernel module like you did with the decoder?"
Great question. Here's the reasoning:
| Aspect | Decoder (in kernel) | Encoder (separate module) |
| --- | --- | --- |
| Necessity | Required to read Brotli PDFs | Optional (only for writing) |
| Dependencies | Pure Java (Google's decoder) | Native code (brotli4j with JNI) |
| Size impact | ~300KB of Java code | ~2MB of native libraries |
| Use frequency | Every user needs to read PDFs | Most users stick with Flate |
| Backward compat | No breaking changes | Opt-in feature |
By keeping the encoder separate, we give users choice: add brotli-compressor if you need 20% smaller files, or stick with the default if native dependencies are a concern.
Putting it all together: Full example
Here's what it looks like to create a Brotli-compressed PDF:
First of all, add the required dependencies. Note that you have to add iText's own artifact repository because of the experimental nature of the code; this way, users don't enable it accidentally.
<repositories>
    <repository>
        <id>itext-releases</id>
        <name>iText Repository - releases</name>
        <url>https://repo.itextsupport.com/releases</url>
    </repository>
</repositories>

<dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>brotli-compressor</artifactId>
    <version>{itext.version.bigger.then.9.5.0}</version>
</dependency>
public static void main(String[] args) throws IOException {
    // 1. Register the compression strategy
    DocumentProperties properties = new DocumentProperties();
    properties.registerDependency(IStreamCompressionStrategy.class, new BrotliStreamCompressionStrategy());

    // 2. Create your PDF as normal
    PdfWriter writer = new PdfWriter("output.pdf");
    PdfDocument pdf = new PdfDocument(writer, properties);

    // Everything from here on uses Brotli automatically
    Document doc = new Document(pdf);
    doc.add(new Paragraph("This text will be Brotli-compressed!"));
    doc.add(new Image(ImageDataFactory.create("chart.png")));
    doc.close();
}
When you open output.pdf in a text editor, you'll see some entries looking like this:
5 0 obj <</Filter/BrotliDecode/Length 847>>stream [binary Brotli-compressed data] endstream endobj
The PDF now uses /BrotliDecode instead of /FlateDecode, and the file is 15-25% smaller—with zero changes to your document-building code.
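If you want to verify the savings yourself, one way is to write the same document twice, once with and once without the strategy registered, and compare file sizes. A minimal sketch (file names and content are placeholders):
import java.io.File;
import java.io.IOException;

import com.itextpdf.kernel.pdf.DocumentProperties;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfWriter;
import com.itextpdf.layout.Document;
import com.itextpdf.layout.element.Paragraph;
// plus IStreamCompressionStrategy and BrotliStreamCompressionStrategy from the brotli-compressor module

public class CompareSizesSketch {
    public static void main(String[] args) throws IOException {
        // Default behaviour: Flate compression
        createPdf("flate.pdf", new DocumentProperties());

        // Same document, but with the Brotli strategy registered
        DocumentProperties brotliProps = new DocumentProperties();
        brotliProps.registerDependency(IStreamCompressionStrategy.class,
                new BrotliStreamCompressionStrategy());
        createPdf("brotli.pdf", brotliProps);

        System.out.println("Flate : " + new File("flate.pdf").length() + " bytes");
        System.out.println("Brotli: " + new File("brotli.pdf").length() + " bytes");
    }

    private static void createPdf(String path, DocumentProperties properties) throws IOException {
        PdfDocument pdf = new PdfDocument(new PdfWriter(path), properties);
        Document doc = new Document(pdf);
        for (int i = 0; i < 500; i++) {
            doc.add(new Paragraph("Some repetitive text to give the compressor something to work with."));
        }
        doc.close();
    }
}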
The catch: Compatibility isn't universal (yet)
Here's the honest truth: Brotli-compressed PDFs won't open in Adobe Acrobat Reader today. They won't render in your browser's built-in PDF viewer. Most third-party PDF libraries will reject them outright.
Why? Because /BrotliDecode isn't part of the official PDF specification yet. The PDF Association is actively working on adding it to ISO 32000 (the PDF standard), but until that's finalized and implementations roll out, Brotli PDFs exist in a gray area.
What about forward compatibility?
Here's the good news: Brotli PDFs are future-proof. Once the PDF Association finalizes the spec and vendors implement it, your existing Brotli-compressed documents will just work. You're not creating broken files—you're creating files that are ahead of their time.
Think of it like HTTP/2 in 2015. Early adopters who deployed it got immediate performance wins in their own infrastructure, and as browsers caught up, those benefits became universal. Brotli PDFs follow the same pattern.
iText's commitment
We're not shipping this as a toy feature. We're working directly with the PDF Association to:
- Standardize the specification (syntax, decode parameters, dictionary support)
- Validate implementations across multiple platforms (Java, .NET, C++)
- Contribute test suites to ensure interoperability when other vendors adopt it
- Support migrations when the spec finalizes (we'll handle any breaking changes)
By adopting Brotli compression now, you're not taking a risk—you're investing in a proven technology that's on a clear path to standardization.
Conclusion
PDF compression hasn't evolved in 30 years—until now. Brotli represents the biggest leap in PDF storage efficiency since the format was invented, and iText is bringing it to production today.
Yes, there are compatibility limitations. Yes, it's experimental. But every standard starts this way. HTTP/2, WebP, and TLS 1.3 were all "experimental" once. Early adopters got the benefits first, then the ecosystem caught up.
By using iText's Brotli implementation now, you're:
- Reducing storage costs by 15-25% immediately
- Future-proofing your documents for inevitable standardization
- Helping shape the spec with real-world feedback
- Voting with code for a more efficient PDF ecosystem
The PDF Association is listening. Adobe is watching. And iText is leading.
Let's make PDFs smaller together. 🚀


