DiffPDFc

DiffPDFc is used to compare two PDF files—textually or visually.

DiffPDFc is a Windows console (command line) program for comparing two PDF files.

DiffPDFc can say whether two PDFs are the same or different, and it can optionally output one or more reports that show any differences. The reports can be textual in .csv (Excel), .json, and .xml format and visual in .png and .pdf format. Multiple reports can be output at the same time.

Comparisons can be made based on the text regardless of layout, or based on appearance (which accounts for fonts, colors, layout, diagrams, images, etc.)

"Your software is the best I've ever used for comparing PDFs."
—Customer comments on DiffPDF/DiffPDFc

DiffPDFc is useful for anyone who needs to compare PDF documents, reports, books, or labels: for example, archivists, engineers, journalists, packagers, publishers, researchers, software testers, and translators. DiffPDFc is used by many kinds of organizations, including banks, insurance companies, and government.

If you require an interactive graphical user interface (GUI) tool, use our DiffPDF application instead.

Try or Buy

You can try DiffPDFc free for up to 20 days using a trial license key. You can also buy a full license key, which has no time limit, for $160 USD (or the local equivalent in many major currencies) through the secure MyCommerce platform. Tiered price discounts apply if you buy at least 10 license keys; these are shown when you click Buy Now.

DiffPDFc will work with all modern versions of Windows (XP, Vista, 7, and 8) whether 32-bit or 64-bit (on x86-compatible processors, i.e., most desktop and laptop computers).

  1. Download the DiffPDFc installer DiffPDFc-4.1.1-win32.msi (20 MB; MD5 30ce1eff5c3b37993b731d1e489b1045).
    (You may see a spurious warning about the .msi file; we have asked Microsoft to fix this.)
  2. Double-click the installer and follow the on-screen instructions to install DiffPDFc.
  3. DiffPDFc can be used once a license key has been registered: run DiffPDFc in a Windows console with the --register option, and in the Register window that pops up, click either the Free Trial button or the Buy Now button to open a web browser window.

If your security settings prevent the buttons from working, use one of these links: Free Trial or Buy Now.

We recommend trying before buying, since license key purchases can't be refunded.
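
Before running the installer, the download can be checked against the MD5 value listed in step 1. The following is a minimal sketch using only Python's standard library; the helper names (md5_of_file, installer_matches) are illustrative, and the installer is assumed to be in the current directory. An MD5 match guards against a corrupted or incomplete download rather than proving authenticity.

```python
import hashlib
from pathlib import Path

# MD5 value quoted next to the installer download in step 1.
EXPECTED_MD5 = "30ce1eff5c3b37993b731d1e489b1045"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def installer_matches(path, expected=EXPECTED_MD5):
    """True if the file's MD5 equals the expected value (case-insensitive)."""
    return md5_of_file(path).lower() == expected.lower()


if __name__ == "__main__":
    # Assumption: the installer was saved to the current directory.
    installer = Path("DiffPDFc-4.1.1-win32.msi")
    if installer.exists():
        print("MD5 OK" if installer_matches(installer) else "MD5 MISMATCH")
    else:
        print(f"{installer} not found")
```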

The manual can be read by running the program with the -m or --manual command line option (provided a PDF viewer such as Acroread is installed). Alternatively, the manual can be opened directly: it is usually installed in "C:\Program Files (x86)\DiffPDFc\doc\". You can also view the manual online at DiffPDFc-4.pdf (~320 KB). The license is available at diffpdfc-license.pdf (26 KB) and from within the program.

Example Usage

Here is an example use of the program:

C:\Users\mark>diffpdfc -Hc -r report.csv -r report.pdf -r report-.png oldfile.pdf newfile.pdf
text different on 2 pages
wrote 'report.csv'
wrote 'report.pdf'
wrote 'report-1.png'
wrote 'report-2.png'

The -Hc option tells DiffPDFc to highlight changes (i.e., insertions, deletions, and replacements) rather than to use plain highlighting. (This option can be made the default by using a diffpdfc.dpc file as explained in the manual.) Here, the program has been asked to produce reports in three different formats. The PNG report has one file per pair of pages while the other reports have all the pages in a single report file. Here is report.csv:

Example Text Report (CSV format)

File,Page,X1,Y1,X2,Y2,Text,Change
oldfile.pdf,2,42.50,197.71,63.18,209.22,The,delete
oldfile.pdf,2,65.95,197.71,101.29,209.22,winner,delete
oldfile.pdf,2,104.45,197.71,113.11,209.22,is,delete
oldfile.pdf,2,116.05,197.71,132.73,209.22,the,delete
oldfile.pdf,2,135.50,197.71,167.63,209.22,player,delete
oldfile.pdf,2,170.85,197.71,192.19,209.22,with,delete
oldfile.pdf,2,194.65,197.71,211.33,209.22,the,delete
oldfile.pdf,2,214.15,197.71,240.15,209.22,most,delete
oldfile.pdf,2,243.10,197.85,274.24,209.22,linked,delete
oldfile.pdf,2,278.65,197.71,316.52,209.22,pieces.,delete
oldfile.pdf,2,259.55,212.11,292.23,223.62,empty,replace
oldfile.pdf,2,276.85,570.81,296.04,582.32,"i.e.,",replace
oldfile.pdf,2,513.40,570.81,552.75,582.32,already,replace
oldfile.pdf,2,42.50,585.21,105.86,596.72,surrounding,replace
oldfile.pdf,2,291.60,599.61,352.30,611.12,surrounded,replace
oldfile.pdf,2,354.70,599.61,382.36,611.12,itself.,replace
newfile.pdf,2,525.80,197.71,553.14,209.22,emp-,replace
newfile.pdf,2,42.50,212.11,51.84,223.62,ty,replace
newfile.pdf,2,42.50,230.81,63.18,242.32,The,insert
newfile.pdf,2,65.95,230.81,101.29,242.32,winner,insert
newfile.pdf,2,104.45,230.81,113.11,242.32,is,insert
newfile.pdf,2,116.00,230.81,132.68,242.32,the,insert
newfile.pdf,2,135.45,230.81,167.58,242.32,player,insert
newfile.pdf,2,170.75,230.81,192.09,242.32,with,insert
newfile.pdf,2,194.55,230.81,211.23,242.32,the,insert
newfile.pdf,2,214.00,230.81,240.00,242.32,most,insert
newfile.pdf,2,242.95,230.81,274.09,242.32,linked,insert
newfile.pdf,2,276.65,230.81,314.52,242.32,pieces.,insert
newfile.pdf,2,275.15,589.51,288.80,601.02,for,replace
newfile.pdf,2,292.20,589.51,340.38,601.02,"example,",replace
newfile.pdf,2,42.50,603.91,77.85,615.42,beside,replace
newfile.pdf,2,272.15,618.31,296.82,629.82,itself,replace
newfile.pdf,2,299.85,618.31,363.88,629.82,surrounded.,replace

By default, two decimal places are used for coordinates, but this can be changed to suit. DiffPDFc can also output text reports in JSON and XML formats.
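
A CSV report like the one above can be post-processed with a short script. The sketch below assumes only the column names shown in the example header (File, Page, Text, Change); the function name summarize_report and the SAMPLE rows (copied from the example report) are illustrative.

```python
import csv
import io
from collections import Counter, defaultdict


def summarize_report(csv_file):
    """Count rows per change type and collect changed words per (file, page).

    csv_file is any open text file (or file-like object) whose first row
    is the header shown in the example report.
    """
    counts = Counter()
    words = defaultdict(list)
    for row in csv.DictReader(csv_file):
        counts[row["Change"]] += 1
        words[(row["File"], int(row["Page"]))].append(row["Text"])
    return counts, words


# A few rows copied from the example report above.
SAMPLE = """File,Page,X1,Y1,X2,Y2,Text,Change
oldfile.pdf,2,42.50,197.71,63.18,209.22,The,delete
oldfile.pdf,2,276.85,570.81,296.04,582.32,"i.e.,",replace
newfile.pdf,2,42.50,230.81,63.18,242.32,The,insert
newfile.pdf,2,292.20,589.51,340.38,601.02,"example,",replace
"""

if __name__ == "__main__":
    counts, words = summarize_report(io.StringIO(SAMPLE))
    for change, n in sorted(counts.items()):
        print(change, n)
    for (name, page), texts in sorted(words.items()):
        print(name, page, " ".join(texts))
```

To read a real report, pass a file opened with open("report.csv", newline="") instead of the in-memory sample; the file's text encoding is not stated here, so it may need to be specified explicitly.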

Note that when DiffPDFc is told to do appearance comparisons, the textual reports indicate the square areas that differ (with a customizable square size).

Example Visual Report (PNG format)

Here is an extract from the report-2.png image:

[Image: DiffPDFc report]

By default, change bars are shown in red, and changes are highlighted with deletions in red, insertions in cyan, and replacements in magenta. In this example, the red and cyan highlighting indicates text that has moved (i.e., been deleted from one place and inserted in another), while the magenta highlighting indicates text that has been replaced in its original position. (Note that if plain highlighting, which is the default, is used rather than change highlighting, all differences are shown in yellow.) All colors can be changed, as can the width of the change bar and many other things; all this is explained in the manual.

DiffPDFc can also produce a PDF report which contains each pair of differing pages.
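
The invocation from Example Usage can also be driven from a script. This is a rough sketch using Python's subprocess module; the function names are illustrative, diffpdfc is assumed to be on the PATH, and only the -a/-c/-w, -H, and -r options listed in the program's help are used. The program's exit status conventions are not described on this page, so the sketch simply returns whatever the program prints.

```python
import subprocess


def build_diffpdfc_command(pdf1, pdf2, reports=(), highlight=None,
                           mode=None, exe="diffpdfc"):
    """Build an argument list for DiffPDFc.

    mode is one of "-a", "-c", "-w" (or None for the program default);
    highlight is e.g. "c" or "p", passed in the -Hc form used in the example.
    """
    cmd = [exe]
    if mode is not None:
        if mode not in {"-a", "-c", "-w"}:
            raise ValueError(f"unknown comparison mode: {mode!r}")
        cmd.append(mode)
    if highlight is not None:
        cmd.append(f"-H{highlight}")
    for report in reports:
        cmd += ["-r", report]  # -r may be repeated, one per report file
    cmd += [pdf1, pdf2]
    return cmd


def run_diffpdfc(*args, **kwargs):
    """Run DiffPDFc and return its standard output as text."""
    cmd = build_diffpdfc_command(*args, **kwargs)
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Only prints the command; call run_diffpdfc(...) to execute it.
    print(build_diffpdfc_command(
        "oldfile.pdf", "newfile.pdf",
        reports=["report.csv", "report.pdf", "report-.png"],
        highlight="c"))
```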

Command Line Interface

Here is DiffPDFc's help output.

Note that the program supports many more options and customizations than are available via the command line.
These can be set by creating a plain text diffpdfc.dpc configuration file as explained in the manual.

C:\Users\mark>diffpdfc -h
usage: diffpdfc.py [-h] [-a | -c | -w] [-H {p,plain,c,changes}] [-r FILE] [-q]
                   [-V] [--pages1 PAGES1] [--pages2 PAGES2] [-s SCALE]
                   [--no-normalize-hyphens] [--no-report-pretty] [-p INT]
                   [-C FILE.dpc]
                   pdf1 pdf2

Compares two PDF files and reports any differences.

positional arguments:
  pdf1                  first PDF file to compare
  pdf2                  second PDF file to compare

optional arguments:
  -h, --help            show this help message and exit
  -a, --appearance      compare pages by their appearance
  -c, --chars           compare pages character by character
  -w, --words           compare pages word by word [default]
  -H {p,plain,c,changes}, --highlight {p,plain,c,changes}
                        [default: plain]
  -r FILE, --report FILE
                        where to report differences to. Supported formats:
                        .csv .jsn .json .pdf .png .xml. Repeat this option to
                        produce multiple reports at the same time
  -q, --quiet           only print error messages
  -V, --verbose         also print progress messages

advanced optional arguments:
  --pages1 PAGES1       pages to compare from pdf1, e.g., 1-5,7,9,11-14 (page
                        numbers start from 1) [default: all]
  --pages2 PAGES2       pages to compare from pdf2, e.g., 1-6,8-9,12-13, or
                        'same' to use the same page numbers as --pages1
                        [default: all]
  -s SCALE, --scale SCALE
                        scale for visual reports as a percentage [25-400;
                        default: 100]
  --no-normalize-hyphens
                        don't treat all kinds of hyphens as the same
  --no-report-pretty    output text reports as compactly as possible
  -o, --use-old-renderer
                        use old renderer for PNG and PDF reports
  -p INT, --pairs INT   how many pairs of pages to compare as a unit. 
                        [1-1000; default: 1]. This is experimental.
  -C FILE.dpc, --config FILE.dpc
                        read configuration options from the given file (in
                        addition to any diffpdfc.dpc files)
  --cores CORES         Use CORES cores [4 cores detected]

alternative usage (one argument only) after which the program exits:
  -m, --manual          show the manual (with full details of many more
                        customization options) in a PDF viewer
  --register            register a trial or full license key
  --license             show the license
  -v, --version         show the version

Copyright © Qtrac Ltd 2013-14. All Rights Reserved.
Full license key registered to Mark Summerfield.
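
The --pages1 and --pages2 values shown in the help (e.g., 1-5,7,9,11-14) read as comma-separated page numbers and ranges. The sketch below shows how such a string might expand into a list of pages; it is a hypothetical illustration, not DiffPDFc's own parser, and it assumes that ranges are inclusive. The special value 'same' accepted by --pages2 is not handled.

```python
def expand_pages(spec):
    """Expand a page specification such as "1-5,7,9,11-14" into a list.

    Assumptions (not confirmed by the help text): ranges include both
    endpoints, and pages are returned in the order given.
    """
    pages = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = (int(n) for n in part.split("-", 1))
            if first < 1 or last < first:
                raise ValueError(f"invalid page range: {part!r}")
            pages.extend(range(first, last + 1))
        else:
            page = int(part)
            if page < 1:  # the help states that page numbers start from 1
                raise ValueError(f"invalid page number: {part!r}")
            pages.append(page)
    return pages


if __name__ == "__main__":
    print(expand_pages("1-5,7,9,11-14"))
```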

Old Versions

We always recommend using the most recent release. Older releases are kept available for those who want to use specific previous versions. (For details of the changes between versions, see the Changes page.)
