How can you find a problem with a programmatically generated PDF?

Question 1

pdf pdf-generation itextsharp ghostscript

Swoop · Sep 2, 2010 · Viewed 7.7k times · Source

Answer

The "cheapest" (and at the same time quite reliable!) way is to use Ghostscript. Let Ghostscript interpret the PDF and see which return value it gives. If it has no problem, the PDF file should be OK. On Windows:

 gswin32c.exe ^
       -o nul ^
       -sDEVICE=nullpage ^
        d:/path/to/file.pdf

The nullpage output device does not create any output file, but Ghostscript reports on stdout/stderr if it encounters an error. Afterwards, check the content of the %errorlevel% pseudo environment variable (for example with echo %errorlevel%); a value of 0 means "no problems". On Linux:

 gs \
       -o /dev/null \
       -sDEVICE=nullpage \
        /path/to/file.pdf

(Check the return value with echo $? right after the command; a value of 0 means "no problems".)

In case of errors, Ghostscript prints some information that may help you locate the problem. In any case, you can at least positively identify the files that have no problems: if Ghostscript can process them, Acrobat (Reader) should have no problem rendering them either.
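If you want to run this check automatically (say, over every PDF a test run produces), the commands above can be wrapped in a small script. This is only a minimal sketch, not part of the original answer: the helper names `build_gs_command` and `check_pdf` are made up for illustration, and it assumes Ghostscript is on the PATH as `gs` (on Windows you would pass `gs_executable="gswin32c.exe"` instead).

```python
import os
import subprocess
import sys


def build_gs_command(pdf_path, gs_executable="gs", null_output=os.devnull):
    """Build the same Ghostscript command line as shown above.

    os.devnull is "nul" on Windows and "/dev/null" on Linux.
    """
    return [gs_executable, "-o", null_output, "-sDEVICE=nullpage", pdf_path]


def check_pdf(pdf_path, gs_executable="gs", null_output=os.devnull):
    """Let Ghostscript interpret one PDF; return (ok, diagnostics).

    ok is True when Ghostscript exits with return value 0.
    diagnostics is whatever Ghostscript wrote to stdout/stderr.
    """
    result = subprocess.run(
        build_gs_command(pdf_path, gs_executable, null_output),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stdout + result.stderr


if __name__ == "__main__":
    # Usage (hypothetical script name): python check_pdfs.py a.pdf b.pdf
    failures = 0
    for path in sys.argv[1:]:
        ok, diagnostics = check_pdf(path)
        print(f"{path}: {'OK' if ok else 'PROBLEM'}")
        if not ok:
            failures += 1
            print(diagnostics)
    # Non-zero exit if any file failed, so a build script can react to it.
    sys.exit(1 if failures else 0)
```

The script returns the captured Ghostscript output along with the pass/fail flag, so it may be worth logging that text even for files that pass, in case it contains warnings.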

Question 2

My group has been using the iTextSharp library and C#/.NET to generate custom, dynamic PDFs. For the most part, this process is working great for our needs. The one problem we can run into during development/testing is layout issues that prevent the PDF from opening or rendering correctly in Adobe Reader, especially in the newer versions of Acrobat/Reader.

The document will open and display correctly for the first X pages. But if there is an error, the remaining pages in the document will not display.

As mentioned, we are usually able to track this problem down to a layout-type issue with our C#/iText code. We eventually find the error by using the guess and check method, or divide and conquer. It works, but it doesn't feel like the best way to solve these problems.

I was wondering if there are any tools available that could speed up the process of validating a PDF document and could help to point out errors in the document?
