I am having an issue trying to set some form fields using Apache PDFBox (1.8.5). I have a few different static PDFs that I am using for testing. Using the following code, I can set the values of form fields and save the resulting PDF. I can then open this PDF in Adobe Reader and see the results:
    PDDocumentCatalog docCatalog = pdfDocument.getDocumentCatalog();
    pdfTemplate.setAllSecurityToBeRemoved(true);
    PDAcroForm acroForm = docCatalog.getAcroForm();
    List fields = acroForm.getFields();
    Iterator fieldsIter = fields.iterator();
    while (fieldsIter.hasNext())
    {
        PDField field = (PDField) fieldsIter.next();
        if (field instanceof PDTextbox) {
            ((PDTextbox) field).setValue("STATIC PDFBOX EDIT");
        }
    }
And then I eventually save the form. For Static PDFs of:
This works just fine. I can open the documents in Adobe Reader XI and see the correct values in the form.
For Static PDFs of:
This does not appear to work. When I open the resulting forms in Adobe Reader XI, the fields do not appear to be populated. But if I open the PDF in my Firefox or Chrome browser's PDF viewer, the fields show as populated there.
How can I set these fields so the values will appear when viewed in Adobe Reader XI?
EDIT: Sample PDFs can be found here: https://github.com/bamundson/PDFExample
The major difference between your PDFs is the form technology used:
Test_9.pdf uses good ol'fashioned AcroForm forms;
Test_10.pdf and Test_10.pdf, on the other hand, use a hybrid form with both an AcroForm representation and an XFA (Adobe XML Forms Architecture) representation.
XFA-aware PDF viewers (i.e. foremost Adobe Reader and Adobe Acrobat) use the XFA information from the file, while XFA-unaware viewers (i.e. most others) use the AcroForm information.
PDFBox is mostly XFA-unaware. This means especially that the PDField objects returned by PDAcroForm.getFields() only represent the AcroForm information. Thus, your ((PDTextbox)field).setValue("STATIC PDFBOX EDIT") calls only influence the AcroForm representation of the form.
This explains your observation:
When I open the resulting forms in Adobe Reader XI, the fields do not appear to be populated. But if I open the PDF in my Firefox or Chrome browser's PDF viewer, the fields show as populated there.
(As far as I know Firefox and Chrome integrated PDF viewers are XFA-unaware.)
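To picture why the two kinds of viewers disagree, here is a toy model of a hybrid form. It is only a sketch and does not use PDFBox at all: the class HybridFormModel and its methods are made up for illustration, and it assumes a viewer simply reads one of two independent copies of the field values.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (hypothetical, not PDFBox API): a hybrid form keeps two copies of its values. */
final class HybridFormModel {
    private final Map<String, String> acroFormValues = new HashMap<>();
    // null stands for "the AcroForm dictionary has no XFA entry"
    private Map<String, String> xfaValues = new HashMap<>();

    /** Mimics the setValue(...) calls from the question: only the AcroForm copy changes. */
    void setAcroFormValue(String field, String value) {
        acroFormValues.put(field, value);
    }

    /** Mimics dropping the XFA entry from the AcroForm dictionary. */
    void removeXfa() {
        xfaValues = null;
    }

    /** The value a viewer would display; null means the field looks empty. */
    String displayedValue(String field, boolean viewerIsXfaAware) {
        if (viewerIsXfaAware && xfaValues != null) {
            return xfaValues.get(field); // XFA-aware viewers prefer the XFA copy
        }
        return acroFormValues.get(field); // everyone else reads the AcroForm copy
    }
}
```

In this model, a value written with setAcroFormValue shows up for an XFA-unaware viewer right away, but an XFA-aware viewer only shows it once removeXfa() has been called or the XFA copy has been updated as well.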
So,
How can I set these fields so the values will appear when viewed in Adobe Reader XI?
There essentially are two ways:
1. Remove the XFA entry from the AcroForm dictionary:
    acroForm.setXFA(null);
If there is no XFA, Adobe Reader will use the AcroForm form information, too.
2. Edit both the AcroForm and the XFA information. You can retrieve the XFA information using
    PDXFAResource xr = acroForm.getXFA();
and extract the underlying XML using
    xr.getDocument()
Then you can edit the XML, put the resulting XML into a stream which you can wrap in a PDXFAResource, which you can then set using acroForm.setXFA(...).
While option 1 certainly is much easier to implement, it only works for hybrid documents. If you will also have to edit pure XFA forms, you'll need to implement option 2.
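For option 2, the "edit the XML" step in the middle can be sketched with the JDK's own DOM classes. The following is a rough sketch under several assumptions, not a definitive implementation: the class name XfaDataEditor is made up, it assumes the field values live under the xfa:data element of the datasets packet, and it assumes the field's element name is unique there. Real XFA documents may be laid out differently, so inspect the XML of your own files first.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper for editing field values in XFA XML; not part of PDFBox. */
final class XfaDataEditor {
    // Namespace of the XFA datasets packet
    static final String XFA_DATA_NS = "http://www.xfa.org/schema/xfa-data/1.0/";

    /** Parses XFA XML bytes into a namespace-aware DOM document. */
    static Document parse(byte[] xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XFA XML", e);
        }
    }

    /**
     * Sets the text of the first element named fieldName below xfa:data.
     * Returns false if there is no xfa:data element or no such field element.
     */
    static boolean setFieldValue(Document xfa, String fieldName, String value) {
        NodeList dataNodes = xfa.getElementsByTagNameNS(XFA_DATA_NS, "data");
        if (dataNodes.getLength() == 0) {
            return false;
        }
        Element data = (Element) dataNodes.item(0);
        // "*" matches the field element whatever namespace it is in
        NodeList matches = data.getElementsByTagNameNS("*", fieldName);
        if (matches.getLength() == 0) {
            return false;
        }
        matches.item(0).setTextContent(value);
        return true;
    }

    /** Serializes the edited document back to bytes, ready to be put into a stream. */
    static byte[] serialize(Document xfa) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(xfa), new StreamResult(out));
            return out.toByteArray();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize XFA XML", e);
        }
    }
}
```

If xr.getDocument() gives you an org.w3c.dom.Document in your PDFBox version, you could pass it straight to setFieldValue and then wrap the bytes from serialize in a stream as described above. Keep setting the AcroForm value too, so that XFA-unaware viewers stay in sync.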
Writing new field values to these PDFs works fine with the latest version of iText.
iText has a certain degree of explicit support for XFA forms.