PDFBox converting inches or centimeters into the coordinate system

Robby F · Feb 3, 2014 · Viewed 10.8k times

I am new to PDFBox (and PDF generation) and I am having difficulty generating my own PDF.

I have text with certain coordinates in inches/centimeters and I need to convert them to the units PDFBox uses. Any suggestions/utilities that can do this automatically?

PDPageContentStream.moveTextPositionByAmount(x,y) is making no sense to me.

Answer

mkl · Feb 3, 2014

In general PDFBox uses the PDF user space coordinates when creating a PDF. This means:

  1. The coordinates of a page are delimited by its CropBox, which defaults to its MediaBox, with values increasing left to right and bottom to top. Thus, if you create a page using new PDPage() or new PDPage(PDPage.PAGE_SIZE_*), the origin of the coordinate system is in the lower left corner of the page.

  2. The unit in user space starts as the default user space unit, which is defined by the UserUnit of the page. Most often (e.g. if you create a page using any of the PDPage constructors and don't explicitly change that value) UserUnit is not set, so its default of 1/72 inch applies.

  3. The user space coordinate system can be changed pretty arbitrarily by concatenating some matrix to the current transformation matrix. The current transformation matrix starts as the identity matrix.

    In PDFBox you do this using one of the PDPageContentStream.concatenate2CTM() overloads.

  4. As soon as you switch to text mode using PDPageContentStream.beginText(), the coordinate system used is furthermore influenced by the transformation introduced by the text matrix.

    In PDFBox you set the text matrix using one of the PDPageContentStream.setTextMatrix() overloads.
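
To make item 3 concrete, here is a minimal sketch of what concatenating a scaling matrix to the identity matrix does to coordinates. It deliberately uses the JDK's java.awt.geom.AffineTransform instead of PDFBox so that it runs standalone; the class and method names (CtmSketch, inchesToDefaultUserSpace) are made up for illustration, and the code only models the matrix arithmetic, not an actual content stream.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

// Hypothetical illustration class; not part of PDFBox.
final class CtmSketch {
    private CtmSketch() {}

    /** Maps a point given in inches to default user space units (1/72 inch). */
    static Point2D inchesToDefaultUserSpace(double xInches, double yInches) {
        // The current transformation matrix starts as the identity matrix.
        AffineTransform ctm = new AffineTransform();
        // Concatenating a scale by 72 makes one unit of the new system one inch.
        ctm.concatenate(AffineTransform.getScaleInstance(72, 72));
        return ctm.transform(new Point2D.Double(xInches, yInches), null);
    }

    public static void main(String[] args) {
        Point2D p = inchesToDefaultUserSpace(1, 2);
        System.out.println(p.getX() + ", " + p.getY()); // prints "72.0, 144.0"
    }
}
```

With such a scale in effect, a position of (1, 2) would mean one inch from the left and two inches from the bottom, assuming the lower left corner of the CropBox is at (0, 0).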

As you are new to PDFBox (as you say) and new to PDF in general (as I presume, because otherwise you would likely have recognized the coordinates), I would advise you to initially refrain from using transformations wherever possible and thus remain in a state where the coordinate system starts in the lower left, is neither rotated nor skewed, and has a unit length of 1/72 inch.
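
Because y grows from bottom to top in this untransformed state, positions measured from the top edge of a sheet have to be flipped. The following is a rough sketch under two assumptions: the lower left corner of the page is at (0, 0), and the page height is already known in user space units (US Letter, 8.5 x 11 inches, is 612 x 792 units). The class TopLeftToUserSpace and its methods are hypothetical helpers, not PDFBox API.

```java
// Hypothetical helper; not part of PDFBox.
final class TopLeftToUserSpace {
    /** Default user space units per inch (UserUnit unset, no transformations). */
    static final float UNITS_PER_INCH = 72f;

    private TopLeftToUserSpace() {}

    /** x is measured from the left edge in both systems, so only the unit changes. */
    static float x(float inchesFromLeft) {
        return inchesFromLeft * UNITS_PER_INCH;
    }

    /** Converts a distance from the top edge into a y value measured from the bottom edge. */
    static float y(float inchesFromTop, float pageHeightInUnits) {
        return pageHeightInUnits - inchesFromTop * UNITS_PER_INCH;
    }

    public static void main(String[] args) {
        float letterHeight = 792f; // 11 inches * 72 units per inch
        // One inch from the left edge and one inch below the top edge.
        System.out.println(x(1f) + ", " + y(1f, letterHeight)); // prints "72.0, 720.0"
    }
}
```

The resulting pair is the kind of value you would pass to PDPageContentStream.moveTextPositionByAmount(x, y) right after beginText().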

For this context you actually can use constants provided by PDFBox for conversion:

  • Multiply coordinates in inches by PDPage.DEFAULT_USER_SPACE_UNIT_DPI to get default user space coordinates.
  • Multiply coordinates in millimeters by PDPage.MM_TO_UNITS to get default user space coordinates.

If you want to have fun with coordinates, though, look at the PDF specification ISO 32000-1 and study sections 8.3 Coordinate Systems and 9.4.4 Text Space Details.


The PDPage constants pointed to above used to be accessible in early PDFBox 1.8.x versions but then got hidden (private), and eventually were removed in the transition to PDFBox 2.x.

For reference, the constants were defined as

private static final int DEFAULT_USER_SPACE_UNIT_DPI = 72;

private static final float MM_TO_UNITS = 1/(10*2.54f)*DEFAULT_USER_SPACE_UNIT_DPI;
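
Since those constants are no longer reachable, one option is to keep equivalent values in your own code. The sketch below is one possible way to do that; the class PdfUnits and its member names are invented for this example, and it assumes the default user space unit of 1/72 inch with no transformations applied.

```java
// Hypothetical helper; not part of PDFBox.
final class PdfUnits {
    /** Default user space units per inch, same value as the former DEFAULT_USER_SPACE_UNIT_DPI. */
    static final float UNITS_PER_INCH = 72f;
    /** Default user space units per millimeter (72 / 25.4), same value as the former MM_TO_UNITS. */
    static final float UNITS_PER_MM = UNITS_PER_INCH / 25.4f;

    private PdfUnits() {}

    static float inchesToUnits(float inches) {
        return inches * UNITS_PER_INCH;
    }

    static float mmToUnits(float mm) {
        return mm * UNITS_PER_MM;
    }

    static float cmToUnits(float cm) {
        return mmToUnits(cm * 10f);
    }

    public static void main(String[] args) {
        // One inch from the left edge, 25.4 mm (also one inch) from the bottom edge.
        float x = inchesToUnits(1f);
        float y = mmToUnits(25.4f);
        System.out.println(x + ", " + y);
    }
}
```

The converted values can then be used wherever PDFBox expects user space coordinates, for example in PDPageContentStream.moveTextPositionByAmount(x, y).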