Reputation: 3038
I'm using PDFClown-0.2.0 to flatten this PDF file. This is the code I have:
import org.pdfclown.documents.Document;
import org.pdfclown.files.File;
import org.pdfclown.files.SerializationModeEnum;
import org.pdfclown.tools.FormFlattener;

public class Sample {
    public static void main(String[] args) {
        try {
            File f = new File("label.pdf");
            Document doc = f.getDocument();
            FormFlattener formFlattener = new FormFlattener();
            formFlattener.flatten(doc);
            f.save(SerializationModeEnum.Standard);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
I'm following the instructions provided at http://pdfclown.org/2014/09/12/waiting-for-pdf-clown-0-2-0-release/#FormFlattening. However, when I run the code, I get the following error:
java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
    at java.util.ArrayList.rangeCheck(ArrayList.java:653)
    at java.util.ArrayList.get(ArrayList.java:429)
    at org.pdfclown.objects.PdfArray.get(PdfArray.java:314)
    at org.pdfclown.documents.interaction.forms.FieldWidgets.get(FieldWidgets.java:135)
    at org.pdfclown.documents.interaction.forms.FieldWidgets$1.next(FieldWidgets.java:380)
    at org.pdfclown.documents.interaction.forms.FieldWidgets$1.next(FieldWidgets.java:1)
    at org.pdfclown.tools.FormFlattener.flatten(FormFlattener.java:74)
    at com.narvar.webservices.returns.retailers.Sample.main(Sample.java:18)
What am I doing wrong? Note that the PDF was generated using PDFBox, and I had made the form fields read-only.
Upvotes: 1
Views: 440
Reputation: 95918
Having debugged into the code, this looks like a PDF Clown bug:
The Iterator returned by org.pdfclown.documents.interaction.forms.FieldWidgets.iterator() does not recognize that the underlying widget collection has changed (gotten smaller), and so it tries to read beyond the collection's current size.
org.pdfclown.tools.FormFlattener.flatten(Document) iterates over the widgets of a field:
for(Widget widget : field.getWidgets())
but inside this loop it removes the current widget from the Kids of the current field:
// Removing the field references relating the widget...
PdfDictionary fieldPartDictionary = widget.getBaseDataObject();
while (fieldPartDictionary != null)
{
  [...]
  kidsArray.remove(fieldPartDictionary.getReference());
  [...]
}
Thus, the collection over which the outer for loop iterates changes during iteration. Unfortunately, the Iterator used here is not aware of changes in the base collection:
return new Iterator<Widget>()
{
  /** Index of the next item. */
  private int index = 0;
  /** Collection size. */
  private final int size = size();

  @Override
  public boolean hasNext( )
  {return (index < size);}

  @Override
  public Widget next( )
  {
    if(!hasNext()) throw new NoSuchElementException();
    return get(index++);
  }

  @Override
  public void remove( )
  {throw new UnsupportedOperationException();}
};
As you can see, the iterator is neither informed of changes to the base collection nor does it check the collection itself. It even keeps its own idea of the collection size: the size at the time the Iterator was created, stored in the final field size.
Such an Iterator implementation is OK for collections that do not change, which can be enforced by architecture or by contract. But in the case at hand I see neither: the architecture obviously allows the collection to change, and there is no hint that the iterator in question may be used only for stable base collections.
This should be fixed.
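The failure mode can be reproduced without PDF Clown. The following is a minimal sketch, not the library's code: SnapshotList is a hypothetical stand-in for FieldWidgets, backed by a plain ArrayList of strings, with an iterator that captures the size once, as in the excerpt above. Removing the current element inside the loop makes the second call to next() read past the end of the shrunken list.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class SnapshotIteratorDemo {
    /** Hypothetical stand-in for FieldWidgets (an assumption, not PDF Clown code). */
    static class SnapshotList implements Iterable<String> {
        private final List<String> items = new ArrayList<>();

        void add(String item) { items.add(item); }
        void remove(String item) { items.remove(item); }

        @Override
        public Iterator<String> iterator() {
            return new Iterator<String>() {
                /** Index of the next item. */
                private int index = 0;
                /** Size captured once, when the iterator is created. */
                private final int size = items.size();

                @Override
                public boolean hasNext() { return index < size; }

                @Override
                public String next() {
                    if (!hasNext()) throw new NoSuchElementException();
                    return items.get(index++);
                }
            };
        }
    }

    /** Returns true if removing elements while iterating throws IndexOutOfBoundsException. */
    static boolean removalDuringIterationFails() {
        SnapshotList widgets = new SnapshotList();
        widgets.add("widget-1");
        widgets.add("widget-2");
        try {
            for (String widget : widgets) {
                // Shrinks the backing list; the iterator still believes size == 2.
                widgets.remove(widget);
            }
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("fails: " + removalDuringIterationFails());
    }
}
```

With two elements, the first pass removes "widget-1", leaving a list of size 1, while the iterator's index is already 1 and its captured size is still 2. That is the same Index 1 / Size 1 situation reported in the question's stack trace.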
A solution can be attempted by changing FormFlattener.flatten to retrieve a local copy of the widgets and iterate over this copy, e.g. by replacing
for(Widget widget : field.getWidgets())
with
List<Widget> widgets = new ArrayList<Widget>(field.getWidgets());
for(Widget widget : widgets)
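The copy-then-iterate idea can be checked in isolation. This is a hedged sketch under stated assumptions: SnapshotCollection is a hypothetical collection of strings (not a PDF Clown class) whose iterator captures the size once, and the loop body only removes the element from the original collection. Whether the same change is sufficient inside FormFlattener.flatten has not been verified here.

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class CopyBeforeIterateDemo {
    /** Hypothetical collection with a size-snapshot iterator (an assumption for illustration). */
    static class SnapshotCollection extends AbstractCollection<String> {
        private final List<String> items = new ArrayList<>();

        @Override
        public boolean add(String item) { return items.add(item); }

        @Override
        public boolean remove(Object item) { return items.remove(item); }

        @Override
        public int size() { return items.size(); }

        @Override
        public Iterator<String> iterator() {
            return new Iterator<String>() {
                private int index = 0;
                /** Size captured once, when the iterator is created. */
                private final int size = items.size();

                @Override
                public boolean hasNext() { return index < size; }

                @Override
                public String next() {
                    if (!hasNext()) throw new NoSuchElementException();
                    return items.get(index++);
                }
            };
        }
    }

    /** Removes every element while iterating over a private copy; returns the remaining size. */
    static int removeAllViaCopy() {
        SnapshotCollection widgets = new SnapshotCollection();
        widgets.add("widget-1");
        widgets.add("widget-2");

        // The copy is taken before any removal, so its iterator never sees the original shrink.
        List<String> copy = new ArrayList<>(widgets);
        for (String widget : copy) {
            widgets.remove(widget);
        }
        return widgets.size();
    }

    public static void main(String[] args) {
        System.out.println("remaining: " + removeAllViaCopy());
    }
}
```

The design choice is to decouple the sequence being iterated from the one being mutated. It costs one extra list per field, which is negligible for the handful of widgets a field typically has.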
Upvotes: 2