How to crop PDF margins using pdftk and /MediaBox

RockScience · Mar 15, 2011

I used pdftk to uncompress a PDF and then opened it as a text file.
I want to edit the /MediaBox field, which in my case is:

/MediaBox [0 0 612 792]

I would like to reduce the margins, for instance

/MediaBox [100 0 512 792]

Unfortunately, it doesn't work. I can change the 0 into a 2 or a 9, but I cannot put 100, for instance.

Any idea why?

Answer

James Duvall · Feb 25, 2013

The string 100 is two characters longer than 0, so typing it into a text editor makes the file two bytes longer. That is why replacing the 0 with 9, 2, or any other single digit works fine. A text editor can in principle be used to edit a PDF, but it is not simple, because you have to respect the internal structure of the file. The xref table, located near the end of a PDF, tells the reader the exact byte offset of each object. It has to be updated whenever the length or location of anything changes.
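To make the offset problem concrete, here is a small Python sketch. The `body` bytes are a made-up fragment, not a complete PDF, and `object_offsets` is a hypothetical helper written only for this illustration. It shows that a same-length edit leaves every object where it was, while the longer edit pushes the following object two bytes further along.

```python
import re

# A tiny, hypothetical fragment of an uncompressed PDF body (not a full file).
body = (
    b"1 0 obj\n<< /Type /Page /MediaBox [0 0 612 792] >>\nendobj\n"
    b"2 0 obj\n<< /Type /Catalog >>\nendobj\n"
)

def object_offsets(data: bytes) -> dict[int, int]:
    """Map each object number to the byte offset of its 'N G obj' header."""
    return {
        int(m.group(1)): m.start()
        for m in re.finditer(rb"^(\d+) \d+ obj", data, re.MULTILINE)
    }

before = object_offsets(body)

# Same-length edit (0 -> 9): nothing moves.
same_length = object_offsets(body.replace(b"[0 0 612 792]", b"[9 0 612 792]"))

# Longer edit (0 -> 100, 612 -> 512): everything after the edit moves.
longer = object_offsets(body.replace(b"[0 0 612 792]", b"[100 0 512 792]"))

print("object 2 moved by", longer[2] - before[2], "bytes")
```

An xref table written for the original file would still list the old offset for object 2, which is why the reader can no longer find it.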

The manual method above using pdftk doesn't work because you are adding two bytes in the middle of the file. Every object after the edit moves two bytes further along, so the offsets recorded in the xref table no longer match and the table is broken. If you manually update all the xref entries, this will work, but it is potentially very tedious. Using sed or any other text-editing tool will not solve the problem, because it changes the file length in exactly the same way. podofo does the xref calculation for you.
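If you want to see what "updating the xrefs" involves, the following Python sketch recomputes the table for a very simple file. It is only an illustration under strong assumptions: a single classic xref section, objects numbered 1..N with generation 0, LF line endings, no object streams, and no incremental updates. `rebuild_xref` and the three-object test document are hypothetical and are not taken from pdftk or podofo. Real PDFs often break these assumptions, so treat this as a way to understand the structure rather than a general repair tool.

```python
import re

def rebuild_xref(pdf: bytes) -> bytes:
    """Recompute the xref table and startxref offset of a simple PDF.

    Assumes one classic xref section, objects 1..N with generation 0,
    and no stream data that happens to look like 'N 0 obj'.
    """
    xref_pos = pdf.rfind(b"\nxref") + 1        # start of the 'xref' keyword
    if xref_pos == 0:
        raise ValueError("no classic xref table found")
    trailer_pos = pdf.find(b"trailer", xref_pos)
    startxref_pos = pdf.find(b"startxref", trailer_pos)
    if trailer_pos < 0 or startxref_pos < 0:
        raise ValueError("trailer or startxref not found")
    body = pdf[:xref_pos]                      # header plus all objects
    trailer = pdf[trailer_pos:startxref_pos]   # trailer dictionary, kept as-is

    # Byte offset of every 'N 0 obj' header in the body.
    offsets = {
        int(m.group(1)): m.start()
        for m in re.finditer(rb"^(\d+) 0 obj", body, re.MULTILINE)
    }
    size = max(offsets) + 1

    # Each entry is exactly 20 bytes: 10-digit offset, 5-digit generation,
    # an in-use/free flag, and a two-byte end of line.
    table = [b"xref\n", b"0 %d\n" % size, b"0000000000 65535 f \n"]
    table += [b"%010d 00000 n \n" % offsets[n] for n in range(1, size)]
    return (body + b"".join(table) + trailer
            + b"startxref\n%d\n%%%%EOF\n" % len(body))

# A minimal hypothetical document. The page object comes second, so that
# widening its /MediaBox shifts the object that follows it.
objs = [
    b"<< /Type /Catalog /Pages 3 0 R >>",
    b"<< /Type /Page /Parent 3 0 R /MediaBox [0 0 612 792] >>",
    b"<< /Type /Pages /Kids [2 0 R] /Count 1 >>",
]
body = b"%PDF-1.4\n" + b"".join(
    b"%d 0 obj\n%s\nendobj\n" % (i, o) for i, o in enumerate(objs, start=1)
)
placeholder = (b"xref\n0 1\n0000000000 65535 f \n"
               b"trailer\n<< /Size 4 /Root 1 0 R >>\nstartxref\n0\n%%EOF\n")

original = rebuild_xref(body + placeholder)                       # consistent file
edited = original.replace(b"[0 0 612 792]", b"[100 0 512 792]")   # xref now stale
fixed = rebuild_xref(edited)                                      # offsets recomputed
```

After the edit, the old table still points two bytes short of object 3. In `fixed`, every entry again points at the matching `N 0 obj` header, and `startxref` points at the new position of the table.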