read, highlight, save PDF programmatically

python linux pdf poppler

Jake · Sep 30, 2011 · Viewed 7.9k times

I'd like to write a small script (which will run on a headless Linux server) that reads a PDF, highlights any text matching one of the strings in an array I pass, and then saves the modified PDF. I imagine I'll end up using something like the Python bindings to Poppler, but unfortunately there's next to zero documentation and I have next to zero experience in Python.

If anyone could point me to a tutorial, example, or some helpful documentation to get me started it would be greatly appreciated!

Answer

Have you tried looking at PDFMiner? It sounds like it does what you want.
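
For the "read and find the matches" half, here is a rough sketch of how that could look. It assumes the maintained pdfminer.six fork (its `extract_pages` helper and layout classes); the function names `find_terms_in_lines` and `iter_pdf_lines` are made up for illustration. Note that PDFMiner is a text-extraction library, so as far as I know this only tells you where the matches are; drawing the highlights and saving the file would need a separate PDF-writing tool.

```python
def find_terms_in_lines(lines, terms):
    """Return (page_number, term, bbox) for every term found in a line.

    `lines` is any iterable of (page_number, text, bbox) tuples.
    Matching is a case-insensitive substring test; blank terms are ignored.
    """
    wanted = [t.strip().lower() for t in terms if t.strip()]
    hits = []
    for page_no, text, bbox in lines:
        lowered = text.lower()
        for term in wanted:
            if term in lowered:
                hits.append((page_no, term, bbox))
    return hits


def iter_pdf_lines(pdf_path):
    """Yield (page_number, text, bbox) for each text line in the PDF.

    Assumption: pdfminer.six is installed. It is imported here, inside the
    function, so the matching helper above works without the library.
    """
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTTextContainer, LTTextLine

    for page_no, page in enumerate(extract_pages(pdf_path), start=1):
        for element in page:
            if isinstance(element, LTTextContainer):
                for line in element:
                    if isinstance(line, LTTextLine):
                        # bbox is (x0, y0, x1, y1) in PDF points,
                        # with the origin at the bottom-left of the page
                        yield page_no, line.get_text(), line.bbox
```

You would call it as `find_terms_in_lines(iter_pdf_lines("input.pdf"), ["foo", "bar"])`, where `input.pdf` is a placeholder path. Each hit carries the bounding box of the whole line containing the term, not of the term itself, so treat it as a starting point for whatever does the actual highlighting.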
