Programmatically replace text in PDF

BramD · Jul 6, 2011

I have PDF files containing text that needs to be replaced. More specifically, the text should be translated and replaced with the translated version. It's important that the rest of the PDF structure stays intact. Note that the text is already present in the PDFs as text, so techniques like OCR are not needed. It would also be nice if the font and other text attributes were kept.

Which libraries would you recommend for extracting the text to an easy-to-edit format (such as CSV) and putting the new text back in again?

Answer

Ed Bayiates · Jul 6, 2011

Assuming you are replacing the text with text in a different language, you will have to choose a different font in most cases, and the font choice is non-trivial. I've successfully used the Foxit libraries to change text and to create PDFs.
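
For the round trip through an editable format such as CSV that the question asks about, the CSV side could be sketched roughly as follows. This is only an illustration using Python's standard csv module. It assumes the extracted text arrives as (page, index, text) tuples from whatever PDF library does the extraction. The column names and both function names are hypothetical and are not part of Foxit or any other PDF library.

```python
import csv
import io

# Hypothetical column layout for the translation sheet.
FIELDS = ["page", "index", "original", "translation"]


def spans_to_csv(spans):
    """Serialize (page, index, text) tuples to CSV text.

    The translation column is left empty for the translator to fill in.
    """
    buf = io.StringIO(newline="")
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    for page, index, text in spans:
        writer.writerow(
            {"page": page, "index": index, "original": text, "translation": ""}
        )
    return buf.getvalue()


def csv_to_replacements(csv_text):
    """Read the edited CSV back into a {(page, index): translation} mapping.

    Rows whose translation column is still empty are skipped, so the
    corresponding text in the PDF would be left untouched.
    """
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    replacements = {}
    for row in reader:
        if row["translation"].strip():
            key = (int(row["page"]), int(row["index"]))
            replacements[key] = row["translation"]
    return replacements
```

Keying each row by page and index is one way to match a translated string back to the place it came from. Writing the replacement into the PDF, and choosing a font that covers the target language, is still up to the PDF library.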