How to fill PDF forms using Python

Atinesh · Sep 21, 2018

I have a PDF form created using Adobe LiveCycle Designer ES 10.4. I need to fill it using Python to reduce manual labor. I searched the web and read some articles; most of them focused on the pdfrw library. I tried it and extracted some information from the PDF form, as shown below.

Code

from pdfrw import PdfReader
pdf = PdfReader('sample.pdf')
print(pdf.keys())
print(pdf.Info)
print(pdf.Root.keys())
print('PDF has {} pages'.format(len(pdf.pages)))

Output

['/Root', '/Info', '/ID', '/Size']
{'/CreationDate': "(D:20180822164509+05'30')", '/Creator': '(Adobe LiveCycle Designer ES 10.4)', '/ModDate': "(D:20180822165611+05'30')", '/Producer': '(Adobe XML Form Module Library)'}
['/AcroForm', '/MarkInfo', '/Metadata', '/Names', '/NeedsRendering', '/Pages', '/Perms', '/StructTreeRoot', '/Type']
PDF has 1 pages

I am not sure how to go further and use pdfrw to access the fillable fields in the PDF form and fill them using Python. Is this possible? Any suggestions would be helpful.

Answer

Eddie · Dec 1, 2018

You can find the form fields here:

pdf.Root.AcroForm.Fields

or here:

pdf.Root.Pages.Kids[page_index].Annots

This is a PdfArray object, which behaves essentially like a Python list. The name of each field is found here:

pdf.Root.AcroForm.Fields[field_index].T

Other keys include the value, .V. There is also a fair amount of display information, such as the font, under .AP.N.Resources.
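To make the .T and .V lookups concrete, here is a minimal sketch that lists field names and values. The helper names (strip_pdf_string, walk_fields, list_fields, print_fields) are hypothetical and not part of pdfrw. The sketch assumes names and values are stored as parenthesised literal strings, and that fields may be nested under .Kids, in which case the partial names are joined with periods.

```python
def strip_pdf_string(raw):
    """Return '(Name)' as 'Name'; leave None and other encodings untouched."""
    if raw is None:
        return None
    text = str(raw)
    if text.startswith('(') and text.endswith(')'):
        return text[1:-1]
    return text  # e.g. hex strings such as <FEFF...> are returned as-is


def walk_fields(fields, prefix=''):
    """Yield (qualified_name, field) for every named field, descending into /Kids."""
    for field in fields or []:
        name = strip_pdf_string(field.T)
        qualified = '.'.join(part for part in (prefix, name) if part)
        if name is not None:
            yield qualified, field
        # Kids without a /T (plain widgets) are skipped but still traversed.
        yield from walk_fields(field.Kids, qualified)


def list_fields(pdf):
    """Return [(qualified_name, value), ...] for a PDF loaded with pdfrw.PdfReader."""
    acro_form = pdf.Root.AcroForm
    if acro_form is None:
        return []
    return [(name, strip_pdf_string(field.V))
            for name, field in walk_fields(acro_form.Fields)]


def print_fields(path):
    # pdfrw is imported here so the helpers above carry no import-time dependency.
    from pdfrw import PdfReader
    for name, value in list_fields(PdfReader(path)):
        print('{}: {}'.format(name, value))
```

Calling print_fields('sample.pdf') should print one line per named field. Intermediate (non-terminal) fields are listed as well, usually with a value of None.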

However, if you update the value of a field and write out the PDF file, the value might only be displayed when the field has focus, i.e. when it is clicked on.

I haven't figured out how to fix that yet.
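One commonly suggested workaround for values that only show up on focus is to set the NeedAppearances flag on the AcroForm dictionary, which asks the viewer to regenerate field appearances when the file is opened. The sketch below is untested against this particular form. The names fill_form and fill_file are hypothetical, fields are matched on their partial name (.T) only, and values are assumed to be plain text. Since the file comes from LiveCycle Designer, it may also carry XFA data that this sketch does not touch, so the result can vary between viewers.

```python
def fill_form(pdf, values):
    """Set /V on every field whose partial name (/T) is a key of `values`.

    `pdf` is an object returned by pdfrw.PdfReader; `values` maps names to str.
    Returns the number of fields that were updated.
    """
    # Imported lazily so that defining this helper does not require pdfrw.
    from pdfrw import PdfObject, PdfString

    acro_form = pdf.Root.AcroForm
    if acro_form is None:
        raise ValueError('PDF has no /AcroForm dictionary')

    pending = list(acro_form.Fields or [])
    filled = 0
    while pending:
        field = pending.pop()
        pending.extend(field.Kids or [])  # descend into nested fields
        # Assumes names are literal strings such as '(Name)'.
        name = str(field.T)[1:-1] if field.T is not None else None
        if name in values:
            field.V = PdfString.encode(values[name])
            filled += 1

    # Ask the viewer to rebuild appearances instead of showing stale ones.
    acro_form.NeedAppearances = PdfObject('true')
    return filled


def fill_file(src_path, dst_path, values):
    """Read src_path, fill matching fields, and write the result to dst_path."""
    from pdfrw import PdfReader, PdfWriter
    pdf = PdfReader(src_path)
    filled = fill_form(pdf, values)
    PdfWriter().write(dst_path, pdf)
    return filled
```

For example, fill_file('sample.pdf', 'filled.pdf', {'Name': 'Alice'}) would write a copy with any field named Name set to Alice. The field name here is only an illustration; use the names actually stored in .T.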