Java pdfBox: Fill out pdf form, append it to pddocument, and repeat

Andrew · Mar 31, 2015

I have a PDF form and I'm trying to use PDFBox to fill it in and print the document. This works well for single-page print jobs, but I had to modify it to handle multiple pages. The form has basic info at the top and a list of contents below. If the contents exceed the room available on the form, I have to produce a multi-page document. I end up with a document whose first page is filled in correctly, while all the remaining pages are the blank template. What am I doing wrong?

PDDocument finalDoc = new PDDocument();
File template = new File("path/to/template.pdf");

//Declare basic info to be put on every page
String name = "John Smith";
String phoneNum = "555-555-5555";
//Get list of contents for each page
List<List<Map<String, String>>> pageContents = methodThatReturnsMyInfo();

for (List<Map<String, String>> content : pageContents) {
    PDDocument doc = PDDocument.load(template); // load is static; no throwaway instance needed
    PDDocumentCatalog docCatalog = doc.getDocumentCatalog();
    PDAcroForm acroForm = docCatalog.getAcroForm();

    acroForm.getField("name").setValue(name);
    acroForm.getField("phoneNum").setValue(phoneNum);

    for (int i=0; i<content.size(); i++) {
        acroForm.getField("qty"+i).setValue(content.get(i).get("qty"));
        acroForm.getField("desc"+i).setValue(content.get(i).get("desc"));
    }

    List<PDPage> pages = docCatalog.getAllPages();
    finalDoc.addPage(pages.get(0));
}

//Then prints/saves finalDoc

Answer

mkl · Apr 2, 2015

There are two major issues in your code:

  • The AcroForm element of a PDF is a document level object. You only copy the filled-in template page into finalDoc. Thus, the form fields are added to finalDoc only as annotations of their respective page but they are not added to the AcroForm of finalDoc.

    This is not apparent in Adobe Reader but form filling services often identify available fields from the document level AcroForm entry and don't search the pages for additional form fields.

  • The actual show stopper: You add fields with identical names to the PDF. But PDF forms are document-wide entities, i.e. there can be only a single field entity with a given name in a PDF. (This field entity may have multiple visualizations, aka widgets, but this requires you to construct a single field object with multiple kid widgets. Furthermore, these widgets are expected to display the same value, which is not what you want...)

    Thus, you have to rename the fields uniquely before adding them to the finalDoc.
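To illustrate the second point without any PDF library, here is a minimal sketch in plain Java. It is a toy model, not PDFBox code: the class FormNameModel and its methods are hypothetical, and the map merely stands in for the document-level form, which has one slot per field name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of a document-wide form: one value slot per field name. Not PDFBox code. */
class FormNameModel {
    private final Map<String, String> fields = new LinkedHashMap<>();

    /** Registers a field; returns false if the name is already taken in this document. */
    boolean addField(String name, String value) {
        if (fields.containsKey(name)) {
            return false; // a second field with the same name cannot coexist
        }
        fields.put(name, value);
        return true;
    }

    /** Number of distinct fields registered so far. */
    int size() {
        return fields.size();
    }

    public static void main(String[] args) {
        // Same name on both pages: the second registration is rejected.
        FormNameModel clashing = new FormNameModel();
        System.out.println(clashing.addField("SampleField", "eins")); // true
        System.out.println(clashing.addField("SampleField", "zwei")); // false

        // Renamed per page: both fields fit into the same document.
        FormNameModel renamed = new FormNameModel();
        System.out.println(renamed.addField("SampleField0", "eins")); // true
        System.out.println(renamed.addField("SampleField1", "zwei")); // true
    }
}
```

A real PDF viewer does not reject the duplicate the way this model does; how it treats clashing names is viewer-dependent, which is why the symptom in the question looks like blank pages rather than an error.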

Here is a simplified example which works on a template with only one field, "SampleField":

byte[] template = generateSimpleTemplate();
Files.write(new File(RESULT_FOLDER,  "template.pdf").toPath(), template);

try (   PDDocument finalDoc = new PDDocument(); )
{
    List<PDField> fields = new ArrayList<PDField>();
    int i = 0;

    for (String value : new String[]{"eins", "zwei"})
    {
        PDDocument doc = PDDocument.load(new ByteArrayInputStream(template)); // static load
        PDDocumentCatalog docCatalog = doc.getDocumentCatalog();
        PDAcroForm acroForm = docCatalog.getAcroForm();
        PDField field = acroForm.getField("SampleField");
        field.setValue(value);
        field.setPartialName("SampleField" + i++);
        List<PDPage> pages = docCatalog.getAllPages();
        finalDoc.addPage(pages.get(0));
        fields.add(field);
    }

    PDAcroForm finalForm = new PDAcroForm(finalDoc);
    finalDoc.getDocumentCatalog().setAcroForm(finalForm);
    finalForm.setFields(fields);

    finalDoc.save(new File(RESULT_FOLDER, "form-two-templates.pdf"));
}

As you can see, all fields are renamed before they are added to finalForm:

field.setPartialName("SampleField" + i++);

and they are collected in the list fields, which is finally added to the finalForm AcroForm:

    fields.add(field);
}
...
finalForm.setFields(fields);
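For the template in the question, which has several fields per page ("name", "phoneNum", "qty0", "desc0", ...), the same renaming has to be applied to every field on every page. The following is only a sketch of one possible naming scheme in plain Java; the class PageFieldNames, its methods, and the "_p" suffix are hypothetical choices, not part of PDFBox. Each resulting string is what one would pass to field.setPartialName(...) in the loop above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: builds per-page unique names for the question's fields. */
class PageFieldNames {

    /** Appends a page suffix, e.g. ("qty3", 1) gives "qty3_p1". The "_p" scheme is an arbitrary choice. */
    static String unique(String templateName, int pageIndex) {
        return templateName + "_p" + pageIndex;
    }

    /** Lists the template field names used on one page that holds rowCount content rows. */
    static List<String> templateNames(int rowCount) {
        List<String> names = new ArrayList<>();
        names.add("name");
        names.add("phoneNum");
        for (int i = 0; i < rowCount; i++) {
            names.add("qty" + i);
            names.add("desc" + i);
        }
        return names;
    }

    public static void main(String[] args) {
        // Print the rename plan for two pages with two content rows each.
        for (int page = 0; page < 2; page++) {
            for (String templateName : templateNames(2)) {
                System.out.println(templateName + " -> " + unique(templateName, page));
            }
        }
    }
}
```

The separator matters: simply appending the page number would make "qty1" on page 10 and "qty11" on page 0 both come out as "qty110". An underscore is used rather than a period because a period separates the parts of a fully qualified field name and so should not appear inside a partial name.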