I have a PDF form created using Adobe LiveCycle Designer ES 10.4. I need to fill it using Python so that we can reduce manual labor. I searched the web and read some articles; most of them focused on the pdfrw library. I tried using it and extracted some information from the PDF form, as shown below.
Code
from pdfrw import PdfReader
pdf = PdfReader('sample.pdf')
print(pdf.keys())
print(pdf.Info)
print(pdf.Root.keys())
print('PDF has {} pages'.format(len(pdf.pages)))
Output
['/Root', '/Info', '/ID', '/Size']
{'/CreationDate': "(D:20180822164509+05'30')", '/Creator': '(Adobe LiveCycle Designer ES 10.4)', '/ModDate': "(D:20180822165611+05'30')", '/Producer': '(Adobe XML Form Module Library)'}
['/AcroForm', '/MarkInfo', '/Metadata', '/Names', '/NeedsRendering', '/Pages', '/Perms', '/StructTreeRoot', '/Type']
PDF has 1 pages
I am not sure how to go further with pdfrw to access the fillable fields in the PDF form and fill them using Python. Is this possible? Any suggestions would be helpful.
You can find the form fields here:
pdf.Root.AcroForm.Fields
or here
pdf.Root.Pages.Kids[page_index].Annots
Each of these is a PdfArray object, which behaves basically like a Python list. The name of a field is found here:
pdf.Root.AcroForm.Fields[field_index].T
Other keys include the value, .V. There is also a bunch of display information, such as the font, under .AP.N.Resources
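To make the attribute paths above concrete, here is a minimal sketch that lists the name and current value of each top-level field. The helper names (strip_pdf_string, list_fields, read_form_fields) are made up for this example and are not part of pdfrw. It assumes the values are literal PDF strings, which pdfrw shows wrapped in parentheses; hex-encoded strings and nested fields under .Kids are not handled.

```python
def strip_pdf_string(value):
    """Drop the surrounding parentheses shown on literal PDF strings."""
    if value is None:
        return None
    text = str(value)
    if text.startswith('(') and text.endswith(')'):
        return text[1:-1]
    return text


def list_fields(fields):
    """Return (name, value) pairs read from the .T and .V keys of each field."""
    pairs = []
    for field in fields or []:
        name = strip_pdf_string(getattr(field, 'T', None))
        value = strip_pdf_string(getattr(field, 'V', None))
        pairs.append((name, value))
    return pairs


def read_form_fields(path):
    """Open a PDF with pdfrw and list its top-level AcroForm fields."""
    # Imported here so the helpers above work even without pdfrw installed.
    from pdfrw import PdfReader

    pdf = PdfReader(path)
    acroform = pdf.Root.AcroForm
    if acroform is None:
        return []
    return list_fields(acroform.Fields)
```

Calling read_form_fields('sample.pdf') should give one pair per top-level field; a field that has no value yet comes back with None as its value.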
However, if you update the value of a field and write out the PDF file, the viewer might only display the new value when the field has focus, i.e. when it is clicked on.
I haven't figured out how to fix that yet.
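One commonly suggested workaround for this display problem is to set the /NeedAppearances flag on the AcroForm dictionary, which asks the viewer to regenerate field appearances from their values. The sketch below is untested against this particular form; the helper names (field_name, match_fields, fill_form) and the output file name are made up for the example. Viewers differ in how they honor the flag, and forms from LiveCycle Designer can also carry XFA data, which this does not touch.

```python
def field_name(field):
    """Return the field's .T entry without the surrounding parentheses."""
    raw = getattr(field, 'T', None)
    if raw is None:
        return None
    text = str(raw)
    if text.startswith('(') and text.endswith(')'):
        return text[1:-1]
    return text


def match_fields(fields, values):
    """Yield (field, new_value) for every field whose name is a key in values."""
    for field in fields or []:
        name = field_name(field)
        if name in values:
            yield field, values[name]


def fill_form(src, dst, values):
    """Write a copy of src to dst with the given field values set."""
    # Imported here so the helpers above work even without pdfrw installed.
    from pdfrw import PdfReader, PdfWriter, PdfObject, PdfString

    pdf = PdfReader(src)
    acroform = pdf.Root.AcroForm
    for field, value in match_fields(acroform.Fields, values):
        # Store the new value as an encoded PDF string.
        field.V = PdfString.encode(value)
    # Ask the viewer to rebuild field appearances from the stored values.
    acroform.NeedAppearances = PdfObject('true')
    PdfWriter().write(dst, pdf)
```

A call might look like fill_form('sample.pdf', 'filled.pdf', {'some_field': 'some value'}), where the keys are the field names found under .T.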