Reputation: 12176
I am stuck on one point. I need to find the positions of the PDAcroForm
fields in a PDF, because I have to do some processing with the x and y values of the fields.
Any idea how to do this? Is the information present in the COS object?
Upvotes: 8
Views: 11192
Reputation: 70
Use the code below with recent PDFBox releases, where getCOSObject() takes the place of the older getDictionary():
private PDRectangle getFieldArea(PDField field) {
    COSDictionary fieldDict = field.getCOSObject();
    COSArray fieldAreaArray = (COSArray) fieldDict.getDictionaryObject(COSName.RECT);
    PDRectangle rectangle = new PDRectangle(fieldAreaArray);
    System.out.println(rectangle);
    return rectangle;
}
Upvotes: 1
Reputation: 181
The accepted answer no longer works. I tried that approach and got a NullPointerException
for some elements. In PDFBox 2.x, you can obtain the rectangle without querying the COS object tree directly.
The information about the field position is stored in its PDAnnotationWidget.
A field can have more than one widget associated with it. To obtain the first one (without checking whether there is one):
PDRectangle rectangle = field.getWidgets().get(0).getRectangle();
To obtain all rectangles (in case there is more than one widget):
List<PDRectangle> rectangles = field.getWidgets().stream().map(PDAnnotation::getRectangle).collect(Collectors.toList());
Upvotes: 8
Reputation: 1167
I had the same problem today. The following code works in my case:
private PDRectangle getFieldArea(PDField field) {
    COSDictionary fieldDict = field.getDictionary();
    COSArray fieldAreaArray = (COSArray) fieldDict.getDictionaryObject(COSName.RECT);
    // Cast to COSNumber rather than COSFloat: /Rect entries may also be stored as integers.
    float left = ((COSNumber) fieldAreaArray.getObject(0)).floatValue();
    float bottom = ((COSNumber) fieldAreaArray.getObject(1)).floatValue();
    float right = ((COSNumber) fieldAreaArray.getObject(2)).floatValue();
    float top = ((COSNumber) fieldAreaArray.getObject(3)).floatValue();
    return new PDRectangle(new BoundingBox(left, bottom, right, top));
}
Edit: karthick's code is shorter, so I use this code now:
private PDRectangle getFieldArea(PDField field) {
    COSDictionary fieldDict = field.getDictionary();
    COSArray fieldAreaArray = (COSArray) fieldDict.getDictionaryObject(COSName.RECT);
    PDRectangle result = new PDRectangle(fieldAreaArray);
    return result;
}
And you can use this code if you want to test that the returned rectangle is correct:
private void printRect(final PDPageContentStream contentStream, final PDRectangle rect) throws IOException {
    contentStream.setStrokingColor(Color.YELLOW);
    contentStream.drawLine(rect.getLowerLeftX(), rect.getLowerLeftY(), rect.getLowerLeftX(), rect.getUpperRightY()); // left
    contentStream.drawLine(rect.getLowerLeftX(), rect.getUpperRightY(), rect.getUpperRightX(), rect.getUpperRightY()); // top
    contentStream.drawLine(rect.getUpperRightX(), rect.getLowerLeftY(), rect.getUpperRightX(), rect.getUpperRightY()); // right
    contentStream.drawLine(rect.getLowerLeftX(), rect.getLowerLeftY(), rect.getUpperRightX(), rect.getLowerLeftY()); // bottom
    contentStream.setStrokingColor(Color.BLACK);
}
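One thing to keep in mind when processing the x and y values: the rectangle is expressed in PDF user space, where by default the origin is the bottom-left corner of the page and y grows upward. If your processing expects a top-left origin (as in images or java.awt), the y value has to be flipped against the page height. Below is a minimal sketch using only the JDK. The class and method names are hypothetical, the four coordinates stand for the values from getLowerLeftX(), getLowerLeftY(), getUpperRightX() and getUpperRightY(), and pageHeight is assumed to come from the page's media box. It assumes an unrotated page whose media box starts at (0, 0).

```java
final class FieldRectCoords {

    /**
     * Converts a rectangle given by its lower-left and upper-right corners
     * (bottom-left origin, y up) into {x, y, width, height} measured from
     * the top-left corner of the page (y down).
     */
    static float[] toTopLeftOrigin(float llx, float lly, float urx, float ury, float pageHeight) {
        float width = urx - llx;
        float height = ury - lly;
        // The top edge of the field sits at ury; its distance from the top of the page is the new y.
        float yFromTop = pageHeight - ury;
        return new float[] { llx, yFromTop, width, height };
    }

    public static void main(String[] args) {
        // Example values only: a 200 x 20 field on a US Letter page (612 x 792 points).
        float[] r = toTopLeftOrigin(100f, 700f, 300f, 720f, 792f);
        System.out.println("x=" + r[0] + " y=" + r[1] + " w=" + r[2] + " h=" + r[3]);
    }
}
```

For a rotated page or a media box with a non-zero origin, the conversion would need to take those into account as well.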
Upvotes: 13
Reputation: 12176
I am able to get the details like this:
COSDictionary trailer = document.getDocument().getTrailer();
COSDictionary root = (COSDictionary) trailer.getDictionaryObject(COSName.ROOT);
COSDictionary acroForm = (COSDictionary) root.getDictionaryObject(COSName.getPDFName("AcroForm"));
if (null != acroForm) {
    COSArray fields1 = (COSArray) acroForm.getDictionaryObject(COSName.getPDFName("Fields"));
    for (int l = 0; l < fields1.size(); l++) {
        COSDictionary field = (COSDictionary) fields1.getObject(l);
        // The field's /Rect entry holds the corner coordinates of the field area.
        COSArray rectArray = (COSArray) field.getDictionaryObject("Rect");
        PDRectangle rect = new PDRectangle(rectArray);
        System.out.println("lower left: " + rect.getLowerLeftX() + "||" + rect.getLowerLeftY());
        System.out.println("upper right: " + rect.getUpperRightX() + "||" + rect.getUpperRightY());
    }
}
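A note if you read the four /Rect numbers yourself: the PDF specification defines a rectangle by any two diagonally opposite corners, so the array is conventionally [lower-left x, lower-left y, upper-right x, upper-right y] but is not guaranteed to be in that order. A minimal sketch of normalizing the values before further processing, in plain Java; the class and method names are hypothetical and not part of PDFBox:

```java
final class RectNormalizer {

    /**
     * Takes two diagonally opposite corners (x1, y1) and (x2, y2) in any order
     * and returns {lowerLeftX, lowerLeftY, upperRightX, upperRightY}.
     */
    static float[] normalize(float x1, float y1, float x2, float y2) {
        return new float[] {
            Math.min(x1, x2), // lower-left x
            Math.min(y1, y2), // lower-left y
            Math.max(x1, x2), // upper-right x
            Math.max(y1, y2)  // upper-right y
        };
    }

    public static void main(String[] args) {
        // Example values only: corners given as upper-right first, then lower-left.
        float[] r = normalize(300f, 720f, 100f, 700f);
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]);
    }
}
```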
Upvotes: 2