PDF File Is Too Large: How to Compress It Without Losing Quality

A PDF that’s too large to email, upload, or share isn’t corrupted, but in practice it poses the same problem as a damaged file: the PDF needs to be transformed into something usable. The good news is that most large PDFs can be reduced significantly without visible quality loss, because the bulk usually comes from embedded images stored at higher resolution than the use case actually needs. This guide covers the compression options, ranked by effort and by how well they preserve quality.

Quick fix

For most large PDFs, Ghostscript with the /ebook preset produces a substantial size reduction with quality acceptable for screen viewing and most printing:

gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf

This downsamples embedded images to 150 DPI, roughly the threshold below which most viewers can’t tell the difference on screen. A 50 MB PDF often comes out at 5–10 MB with no visible quality change.

The -dPDFSETTINGS parameter accepts five values, in order of increasing quality and file size:

  • /screen — 72 DPI images. Smallest files. Quality acceptable only for on-screen viewing of unimportant material.
  • /ebook — 150 DPI images. Good balance for most uses.
  • /printer — 300 DPI images. Higher quality, larger files. Suitable for actual printing.
  • /prepress — 300 DPI with color preservation. For files going to professional printing.
  • /default — Ghostscript’s default settings, which may produce files larger than the source.

Try /ebook first. If quality is unacceptable, step up to /printer. If files are still too large, step down to /screen.
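That try-and-step approach can be scripted. The sketch below is an illustration, not part of Ghostscript itself: the compress_under helper, the preset order, and the size budget are assumptions you should adapt. It walks down the preset ladder until the output fits under a byte limit.

```python
import subprocess
from pathlib import Path

# Presets in decreasing quality, matching the list above.
PRESETS = ["printer", "ebook", "screen"]

def build_gs_cmd(preset, src, dst):
    """Assemble the Ghostscript argv for a given -dPDFSETTINGS preset."""
    return [
        "gs", "-sDEVICE=pdfwrite", "-dCompatibilityLevel=1.4",
        f"-dPDFSETTINGS=/{preset}", "-dNOPAUSE", "-dQUIET", "-dBATCH",
        f"-sOutputFile={dst}", str(src),
    ]

def compress_under(src, dst, max_bytes):
    """Try successively stronger presets until dst fits the budget."""
    for preset in PRESETS:
        subprocess.run(build_gs_cmd(preset, src, dst), check=True)
        if Path(dst).stat().st_size <= max_bytes:
            return preset
    return None  # even /screen could not hit the budget

if __name__ == "__main__":
    # Example budget: 10 MB (an assumption; pick your own limit).
    print(compress_under("input.pdf", "output.pdf", 10 * 1024 * 1024))
```

Starting from /printer rather than /ebook trades a little time for the best quality that still fits the budget.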

If that didn’t work

If Ghostscript compression isn’t enough or removes too much quality, a more controlled approach is to identify what’s actually large in the file and target that specifically.

Use qpdf to inspect the file’s object structure and find the largest objects — typically images, fonts, or embedded files:

qpdf --json input.pdf > structure.json

Look through the resulting JSON for objects with large sizes. Images appear as objects of type /XObject with /Subtype /Image. If a few specific images are dominating the file size, replacing them with smaller versions before re-creating the PDF is more effective than blanket compression.
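A short script can do that scan for you. The sketch below is deliberately layout-agnostic, because qpdf’s JSON structure varies between versions; the find_images helper is an assumption for illustration, not a qpdf feature. It walks the JSON recursively and reports anything that looks like an image XObject.

```python
import json

def find_images(node, found=None):
    """Recursively collect dicts that look like image XObjects
    (/Subtype /Image), regardless of where they sit in the JSON."""
    if found is None:
        found = []
    if isinstance(node, dict):
        if node.get("/Subtype") == "/Image":
            found.append({
                "width": node.get("/Width"),
                "height": node.get("/Height"),
            })
        for value in node.values():
            find_images(value, found)
    elif isinstance(node, list):
        for value in node:
            find_images(value, found)
    return found

if __name__ == "__main__":
    with open("structure.json") as f:
        for img in find_images(json.load(f)):
            print(img)
```

An image reporting 4000×3000 pixels in a document viewed on screen is a strong candidate for downsampling.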

If you prefer a GUI workflow, Adobe Acrobat’s Reduce File Size feature (File > Save As Other > Reduced Size PDF, or Tools > PDF Optimizer for fine-grained control) produces results comparable to Ghostscript’s. The PDF Optimizer in particular lets you control image downsampling thresholds, font embedding policies, and structural compression separately.

For batch workflows or programmatic compression, pikepdf provides Python access to the same engine qpdf uses, with control over object stream compression and structural optimization.

Advanced recovery

When extreme compression is needed and quality loss is acceptable, combining multiple techniques produces the smallest files:

  1. Extract images from the PDF using pdfimages (from the Poppler tools): pdfimages -all input.pdf images_ (output files are named images_-000.jpg, images_-001.png, and so on, in each image’s native format).
  2. Downsample or recompress those images with ImageMagick or similar: magick images_-000.jpg -quality 60 -resize 1024x images_-000_small.jpg.
  3. Recreate the PDF from the compressed images using a PDF builder tool.
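The three steps above can be chained in one script. This is a sketch, not a finished tool: the file naming and the choice of img2pdf as the rebuild tool are assumptions to adapt to your pipeline.

```python
import subprocess
from pathlib import Path

def extract_cmd(src, prefix="images_"):
    """Step 1: pull every embedded image out in its native format."""
    return ["pdfimages", "-all", str(src), prefix]

def recompress_cmd(img, quality=60, max_width=1024):
    """Step 2: JPEG-recompress one extracted image and cap its width."""
    out = Path(img).with_suffix(".small.jpg")
    return ["magick", str(img), "-quality", str(quality),
            "-resize", f"{max_width}x", str(out)]

def rebuild_cmd(images, dst):
    """Step 3: wrap the compressed images back into a PDF (img2pdf here)."""
    return ["img2pdf", *map(str, images), "-o", str(dst)]

if __name__ == "__main__":
    subprocess.run(extract_cmd("input.pdf"), check=True)
    for img in sorted(Path(".").glob("images_-*")):
        subprocess.run(recompress_cmd(img), check=True)
    small = sorted(Path(".").glob("images_-*.small.jpg"))
    subprocess.run(rebuild_cmd(small, "rebuilt.pdf"), check=True)
```

Note that rebuilding from images discards the text layer, so this approach suits scans and image-heavy documents, not searchable text PDFs.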

This is more effort than running Ghostscript with /screen, but it produces noticeably smaller files for image-heavy documents — typically scanned documents, presentations with photographs, or technical PDFs with detailed diagrams.

For text-only or text-heavy PDFs where image compression won’t help, qpdf’s structural compression options reduce size without affecting visible content:

qpdf --object-streams=generate --compress-streams=y input.pdf compressed.pdf

This packs PDF objects into compressed object streams and ensures all content streams use FlateDecode compression. The size reduction is modest — typically 5–15% — but lossless and instant. See the complete guide to qpdf for the full optimization options.
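For a whole directory of files, the same qpdf invocation can be wrapped in a few lines of Python. This is a sketch under assumed conventions: the directory layout, output naming, and the compress_dir helper are illustrations, not qpdf features.

```python
import subprocess
from pathlib import Path

def qpdf_cmd(src, dst):
    """Lossless structural compression, mirroring the command above."""
    return ["qpdf", "--object-streams=generate", "--compress-streams=y",
            str(src), str(dst)]

def compress_dir(src_dir, out_dir):
    """Run qpdf over every PDF in src_dir, writing results to out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for pdf in sorted(Path(src_dir).glob("*.pdf")):
        subprocess.run(qpdf_cmd(pdf, out / pdf.name), check=True)

if __name__ == "__main__":
    compress_dir("pdfs", "pdfs_compressed")
```

Because the operation is lossless, the outputs can safely replace the originals once spot-checked.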

Why this happens

PDFs grow large for a small set of well-understood reasons. Knowing which one applies to your file determines which compression approach works best.

Embedded images at full resolution are the dominant cause. A photograph captured at 24 megapixels and inserted into a PDF carries all 24 megapixels into the file, even though the on-page rendered size is far smaller. Word, PowerPoint, and most PDF creators don’t downsample images automatically. A document with a dozen photographs can easily reach 100 MB even if the actual content is mostly text.

Embedded fonts that aren’t subsetted are the second most common cause. A PDF that needs three characters from a font but embeds the entire font file — sometimes 10 MB or more for a complete Unicode font — is unnecessarily large. Most modern PDF creators subset fonts automatically, but older tools and some print-to-PDF drivers don’t.

Uncompressed object streams. PDF supports compressing the structural objects (not just the content) using FlateDecode (effectively zip compression). PDFs created by older tools or with compression intentionally disabled are larger than they need to be even before image and font issues.

Embedded files attached to the PDF. PDFs can contain attached files (Excel spreadsheets, source data, supporting documents) embedded as additional objects. These attachments carry their full size into the PDF. Acrobat’s File > Properties > Description shows when a PDF has attachments; PDF Optimizer can remove them.

Legacy PDF version with inefficient encoding. Older PDF specifications (1.2, 1.3) lack some compression features that newer versions support. Re-saving as PDF 1.5 or later, particularly with object streams enabled, reduces size with no quality loss.

Forms, annotations, and JavaScript. These accumulate over a PDF’s editing history. A heavily edited PDF can carry hundreds of obsolete revision markers, deleted annotations, and old form data. Saving the file with PDF Optimizer’s “Discard Objects” cleanup options strips these.

Preventing this in future

Compress at creation, not after the fact. Most PDF creators have size-optimization options that produce smaller files from the start without the round-trip quality loss of compressing an already-large file.

In Microsoft Word: File > Save As > PDF, then in the dialog choose Minimum size (publishing online) rather than Standard. This downsamples images automatically during export.

In Adobe Acrobat when creating PDFs from scans or printouts: use Reduced Size PDF as the save target rather than full PDF.

For documents with photographs, downsample the photos to the resolution actually needed before inserting them into the source document. A photograph displayed at 4 inches wide on a printed page only needs about 1200 pixels of width at 300 DPI; anything more is wasted.
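That arithmetic generalizes: pixels needed = printed width in inches × target DPI. A quick helper (hypothetical, purely for illustration) makes the check easy before inserting a photo:

```python
def pixels_needed(print_width_inches, dpi=300):
    """Minimum image width in pixels for a given printed width."""
    return int(print_width_inches * dpi)

# A 4-inch-wide photo printed at 300 DPI needs 1200 px of width;
# at 150 DPI (screen-oriented /ebook quality), only 600 px.
```

Anything beyond that pixel count adds file size without adding visible detail at the target output size.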

For documents that are regularly distributed by email — newsletters, reports, marketing materials — keep a “compressed” version alongside the master. Distribute the compressed version; archive the master.

If your PDF is large because it accumulated content from a merge operation that went wrong, the “merged PDF file is corrupted” guide covers the merge-side fix. For PDFs that became large after a recovery operation — Ghostscript output is sometimes larger than the original source if applied with the wrong settings — re-running with a compression preset rather than /default typically resolves the size growth. And if you need to compress a large batch of PDFs as part of a workflow, the qpdf complete guide covers scriptable approaches that don’t require manual processing of each file.

Last verified: April 2026