Convert multipage PDF to PNG and back (Linux)

Marty Fried · Mar 14, 2012

I have a lot of PDF documents that I want to convert to PNG, edit in GIMP, and then save back to a multipage Acrobat file. I'm filling out forms and adding a scanned signature. I'm trying to avoid printing, signing, and then scanning back in, while still being able to type the information I need to enter.

I've been trying to use ImageMagick to convert to PNG files, which seems to work fine. I use the command

convert -quality 100 -density 300x300 multipage.pdf single%d.png

(I'm not really sure if the quality parameter is right for PNG.)

But I'm having problems with saving back to PDF. Some of the files have the wrong page size, and I've tried every command and procedure I can find, but there are always a few odd sizes. The resolution seems to vary so that it looks good at a certain zoom level, but either a few pages are specified at about 2" wide, or they are 8.5x11 but the others are about 35" wide. I've tried making sure GIMP had the canvas size and resolution correct, and saving the resolution in the file, but that doesn't seem to matter.

The command I use to save the files is

convert -page letter -adjoin single*.png multipage.pdf

I've tried other parameters, but none seemed to matter.

If anyone has any ideas or alternatives, I'd appreciate it.

Answer

Kurt Pfeifle · Aug 21, 2012

"I'm not really sure if the quality parameter is right for PNG."

For PNG output, the -quality setting is very unlike JPEG's quality setting (which is simply an integer from 0 to 100).

For PNG it is composed of two single digits:

  • The first digit (tens) is (largely) the zlib compression level, and it may go from 0 to 9.
    (However, the setting of 0 has a special meaning: when you use it you'll get Huffman-only compression, not zlib compression level 0. This is often better. Weird but true.)

  • The second digit is the PNG data encoding filter type (before it is compressed):

    • 0 is none,
    • 1 is "sub",
    • 2 is "up",
    • 3 is "average",
    • 4 is "Paeth", and
    • 5 is "adaptive".

In practical terms that means:

  • For illustrations with solid sequences of color a "none" filter (-quality 00) is typically the most appropriate.
  • For photos of natural landscapes an "adaptive" filtering (-quality 05) is generally the best.
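
To make the two-digit scheme concrete, here is a minimal sketch that splits a -quality value into its tens and ones digits. The function name png_quality_decode is my own hypothetical helper, not something ImageMagick provides; it only does the arithmetic described above.

```shell
# png_quality_decode VALUE
# Hypothetical helper (not part of ImageMagick): splits a PNG -quality
# value into the compression digit and the filter digit described above.
png_quality_decode() {
  q=${1#0}            # drop one leading zero so "05" is not read as octal
  q=${q:-0}           # "0" becomes empty after stripping, so restore it
  level=$((q / 10))   # tens digit: zlib compression level (0 is the special case)
  filter=$((q % 10))  # ones digit: PNG filter type
  case $filter in
    0) name=none ;;
    1) name=sub ;;
    2) name=up ;;
    3) name=average ;;
    4) name=Paeth ;;
    5) name=adaptive ;;
    *) name=unknown ;;
  esac
  echo "level=$level filter=$name"
}

png_quality_decode 95   # level=9 filter=adaptive
png_quality_decode 00   # level=0 filter=none
```

Read this way, a value such as 100 does not fit the scheme at all, which is one reason to choose the two digits deliberately instead of borrowing a JPEG-style number.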

"I'm having problems with saving back to PDF. Some of the files have the wrong page size, and I've tried every command and procedure I can find [...] but either a few pages are specified at about 2" wide, or they are 8.5x11 but the others are about 35" wide."

Since I don't have your PNG files, I created a few simple ones with different dimensions to verify the different commands (as I wasn't sure myself any more). Indeed, the one you used:

convert -page letter -adjoin single*.png multipage.pdf

does create all PDF pages in the same letter size, but it always places my (differently sized) sample PNGs in the lower left corner of the PDF page. (Should a PNG exceed the PDF page size, it is scaled down to fit, but smaller PNGs are not scaled up to fill the available page space.)
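
As for the odd widths you report, here is a guess only, since I have not seen your files: they would be consistent with pixel dimensions being interpreted at a different resolution than the one the image was created at. Printed size in inches is simply pixels divided by pixels per inch. The small sketch below does that division; px_to_inches is a hypothetical helper name, and the pixel widths are assumed values for a letter page at 300 and 72 dpi, not measurements of your files.

```shell
# px_to_inches PIXELS DPI
# Hypothetical helper: printed size in inches = pixels / dots per inch.
px_to_inches() {
  LC_ALL=C awk -v px="$1" -v dpi="$2" 'BEGIN { printf "%.2f\n", px / dpi }'
}

px_to_inches 2550 300   # 8.50  : a letter-width page rendered at 300 dpi
px_to_inches 2550 72    # 35.42 : the same pixels read as 72 dpi
px_to_inches 612 300    # 2.04  : a 72 dpi page read as 300 dpi
```

If that is what is happening, checking the resolution stored in each PNG (for example with ImageMagick's identify -verbose) should show which pages differ from the rest.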

The following modification to the command will place the PNGs into the center of each PDF page:

convert           \
  -page letter    \
  -adjoin         \
   single*.png    \
  -gravity center \
   multipage.pdf

If this is still not good enough for you, you can enforce a (possibly non-proportional!) scaling to almost fill the letter area by adding a -scale '590!x770!' parameter (this will leave a border of 11 pt at each edge of the page):

convert              \
  -page letter       \
  -adjoin            \
   single*.png       \
  -gravity center    \
  -scale '590!x770!' \
   multipage.pdf
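
The numbers in that geometry come from plain subtraction: a letter page is 612 x 792 pt, and an 11 pt border on each edge removes 22 pt from each dimension. The sketch below computes the width and height for any border; letter_fill_geometry is a hypothetical helper name, and it assumes the images are handled at 72 pixels per inch, so that one pixel corresponds to one point.

```shell
# letter_fill_geometry BORDER_PT
# Hypothetical helper: prints WIDTHxHEIGHT of the area left on a US letter
# page (612 x 792 pt) after reserving BORDER_PT on every edge.
# Assumes 72 pixels per inch, i.e. 1 px = 1 pt.
letter_fill_geometry() {
  border=$1
  echo "$((612 - 2 * border))x$((792 - 2 * border))"
}

letter_fill_geometry 11   # 590x770
letter_fill_geometry 18   # 576x756, a quarter-inch border
```

The flags (such as the exclamation marks) still have to be added by hand when the result is passed to -scale.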

To omit the extra border, use -scale '612!x792!'. Should you want images to be scaled up only when needed, while keeping the aspect ratio of the PNG, use -scale '590<x770<':

convert              \
  -page letter       \
  -adjoin            \
   single*.png       \
  -gravity center    \
  -scale '590<x770<' \
   multipage.pdf
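
One more detail worth checking in all of the commands above: the shell expands single*.png in lexical order, so with unpadded numbers from single%d.png, page 10 comes before page 2 once a document has more than ten pages. The sketch below demonstrates the ordering with empty placeholder files in a temporary directory; glob_order is a hypothetical helper, and it forces the C locale so the result is predictable.

```shell
# glob_order FILE...
# Hypothetical helper: creates empty files with the given names in a
# temporary directory and prints them in the order single*.png expands to.
# The body runs in a subshell, so cd and LC_ALL do not leak to the caller.
glob_order() (
  LC_ALL=C; export LC_ALL
  dir=$(mktemp -d) || exit 1
  cd "$dir" || exit 1
  touch "$@"
  echo single*.png
  cd / && rm -rf "$dir"
)

glob_order single1.png single2.png single10.png
# single1.png single10.png single2.png

glob_order single001.png single002.png single010.png
# single001.png single002.png single010.png
```

Writing the pages with a zero-padded pattern such as single%03d.png in the first convert command keeps the pages in their original order when they are joined back together.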