I'd like to write a small script, to run on a headless Linux server, that reads a PDF, highlights any text matching one of the strings in an array I pass in, and then saves the modified PDF. I imagine I'll end up using something like the Python bindings to poppler, but unfortunately there's next to no documentation and I have next to no experience in Python.
If anyone could point me to a tutorial, example, or some helpful documentation to get me started it would be greatly appreciated!
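Have you tried looking at PDFMiner? It can extract text together with its position on the page, which covers the matching half of what you want. As far as I know it is an extraction tool and does not write modified PDFs, so you would probably need a second library for the highlighting and saving.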
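
For what it's worth, here is a minimal sketch of the read / highlight / save flow from the question. It swaps in PyMuPDF (imported as `fitz`) rather than PDFMiner or the poppler bindings, since PyMuPDF can both search a page and write highlight annotations. The function names `clean_terms` and `highlight_terms` are made up for this example, and I'm assuming PyMuPDF is installed (`pip install pymupdf`). Treat it as a starting point, not a tested solution for your PDFs.

```python
def clean_terms(terms):
    """Strip whitespace, drop empty strings, and remove duplicates, keeping order."""
    seen = set()
    result = []
    for term in terms:
        term = term.strip()
        if term and term not in seen:
            seen.add(term)
            result.append(term)
    return result


def highlight_terms(src_path, dst_path, terms):
    """Highlight every occurrence of each term and save the result as a new PDF.

    Returns the number of highlight annotations added.
    """
    # Assumption: PyMuPDF is installed. It is imported here so that
    # clean_terms stays usable even on a machine without it.
    import fitz

    wanted = clean_terms(terms)
    count = 0
    doc = fitz.open(src_path)
    try:
        for page in doc:
            for term in wanted:
                # search_for returns one rectangle per hit on this page.
                for rect in page.search_for(term):
                    page.add_highlight_annot(rect)
                    count += 1
        # Write to a separate file so the original PDF is left untouched.
        doc.save(dst_path)
    finally:
        doc.close()
    return count
```

You would call it as `highlight_terms("in.pdf", "out.pdf", ["first phrase", "second phrase"])`. It needs no display, so it should be fine on a headless server. Note that a match which wraps across lines may come back as more than one rectangle, so the returned count can be higher than the number of logical matches.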