Reputation: 1713
I'm using PDFBox's PDPage::convertToImage to display PDF pages in Java. I'm trying to create clickable areas on the PDF page's image based on COSObjects in the page (namely, AcroForm fields). The problem is that the PDF seems to use a completely different coordinate system:
System.out.println(field.getDictionary().getItem(COSName.RECT));
yields
COSArray{[COSFloat{149.04}, COSFloat{678.24}, COSInt{252}, COSFloat{697.68}]}
If I were to estimate the actual dimensions of the field's rectangle on the image, it would be 40,40,50,10 (x,y,width,height). There's no obvious correlation between the two and I can't seem to find any information about this with Google.
How can I determine the pixel position of a PDPage's COSObjects?
Upvotes: 2
Views: 4540
Reputation: 82461
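The PDF coordinate system is not that different from the coordinate system used in images. The only differences are that the y-axis points up instead of down (the origin is at the bottom-left corner of the page) and that the scale is usually different.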
You can convert from pdf coordinates to image coordinates using these formulae:
x_image = x_pdf * width_image / width_page
y_image = (height_page - y_pdf) * height_image / height_page
To get the page size, simply use the mediabox size of the page that contains the annotation:
PDRectangle pageBounds = page.getMediaBox();
You may have missed the correlation between the array from the PDF and your image coordinate estimates, since a rectangle in PDF is represented as an array [x_left, y_bottom, x_right, y_top].
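As a rough sketch of how the two formulae and the rectangle layout fit together, the following plain-Java helper turns a PDF rectangle into an x, y, width, height box in image pixels. The class name PdfToImageCoords and the method toImageRect are made up for illustration, the 612 x 792 page size is an assumption (the question does not state it), and the code assumes the media box starts at (0, 0).

```java
public class PdfToImageCoords {

    /**
     * Converts a PDF rectangle [xLeft, yBottom, xRight, yTop] (origin at the
     * bottom-left, y pointing up) into {x, y, width, height} in image pixels
     * (origin at the top-left, y pointing down).
     * Assumes the page's media box has its lower-left corner at (0, 0).
     */
    static int[] toImageRect(double[] pdfRect, double pageWidth, double pageHeight,
                             int imageWidth, int imageHeight) {
        double scaleX = imageWidth / pageWidth;
        double scaleY = imageHeight / pageHeight;

        double x = pdfRect[0] * scaleX;
        // The top edge of the PDF rectangle becomes the image y after the flip.
        double y = (pageHeight - pdfRect[3]) * scaleY;
        double width = (pdfRect[2] - pdfRect[0]) * scaleX;
        double height = (pdfRect[3] - pdfRect[1]) * scaleY;

        return new int[] {
            (int) Math.round(x), (int) Math.round(y),
            (int) Math.round(width), (int) Math.round(height)
        };
    }

    public static void main(String[] args) {
        // Rectangle values taken from the question's COSArray.
        double[] fieldRect = {149.04, 678.24, 252, 697.68};
        // Assumed page size of 612 x 792 points, rendered at twice that size.
        int[] r = toImageRect(fieldRect, 612, 792, 1224, 1584);
        System.out.println(r[0] + "," + r[1] + "," + r[2] + "," + r[3]);
    }
}
```

Note that the image y is computed from y_top rather than y_bottom, because the top edge of the field ends up closest to the image origin once the axis is flipped.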
Fortunately, PDFBox provides classes that operate on a higher level than the COS structure. Use this to your advantage: for example, use the PDRectangle you get from the PDAnnotation using getRectangle() instead of accessing the COSArray you extract from the field's dictionary.
Upvotes: 8