arcy

Reputation: 13123

Using less memory filling a PDF form, with flattening, using iText

I have a web application that uses a couple of PDF forms to create documents of up to 500 pages; each form is one page and has 40-50 fields on it. The completed document is display-and-print only; there is no need to retain the fill-in aspect of the PDF form as the document is being created.

I have working code using iText 1.4.5; it creates these documents in less than 30 seconds (WebSphere, MVS), which is fine for my purposes.

The app does use significant amounts of memory, and it recently contributed to a server crash. I would like to know whether I can modify the existing code so that it keeps most of its attributes but uses significantly less memory. That should be possible: the amount of memory used suggests the entire document is held in memory until it is completed, and my logic has no need for that. Once a page is filled, my program is done with it; the page could be written to disk and any memory associated only with that page freed.

I have found reference to the method com.lowagie.text.pdf.PdfWriter.freeReader(), but am not sure how to use it in my environment. My question is whether it would cause my program to use less memory (at one time) and where to put the call.

I create iText Document, PdfWriter, and PdfReader objects as follows:

public PdfFormFiller(String givenInputSpecification, 
                        Document givenDocument, 
                        PdfWriter givenWriter) 
{
  // instance fields stored for PDF or tracking purposes.
  inputSpecification = givenInputSpecification;
  document = givenDocument;
  writer = givenWriter;
  contentByte = writer.getDirectContent();
  // 'DirectContentUnder' is a contentByte object that allows
  // our app to write out document content that appears
  // underneath things written to the DirectContentOver; i.e.,
  // this is a layer underneath some other things.
  underContent = writer.getDirectContentUnder();

  try
  {
    PdfReader reader = new PdfReader(inputSpecification);
    template = writer.getImportedPage(reader, 1);           // this line limits us to 1-page forms;
    AcroFields aFields = reader.getAcroFields();            // the fields on the form.
  <<more stuff in this constructor, deleted from here>>  

I fill in values in the form using this:

/**
 * 'Fill' this given form with the given data values, i.e., write the given data
 * values onto the positions in the forms corresponding to their field names. 
 * @param fieldValueMap a map with each key the name
 * of the data field, and each value the string to be put on
 * the form for that field.  
 */
public void fillForm(Map fieldValueMap) throws DocumentException
{
  Iterator keys = fieldValueMap.keySet().iterator();
  while (keys.hasNext())
  {
    String fieldName = (String)keys.next();
    FormField formField = (FormField)fields.get(fieldName);
    String value = null;
    if (fieldName != null)
      {
        value = (String)fieldValueMap.get(fieldName);
      }
    if (null != value && null != formField)
    {
      fillField(formField, value);
    }
  }
  // add the template of the form; the fact that it is added
  // to "underContent" causes iText to put it in a list if it's
  // not already there, so it only gets added once per doc.
  underContent.addTemplate(getTemplate(), 0, 0);

  // start a new page - throws DocumentException
  document.newPage();
}

And I write a value to a field using this:

/**
 * fills the given field with the given value
 * @param formField field and attributes
 * @param value String value
 */
private void fillField(FormField formField, String value) throws DocumentException
{
  if (formField.fieldType == AcroFields.FIELD_TYPE_CHECKBOX)
  {
    // a value starting with 'Y' or 'y' checks the box; anything else (including "") leaves it blank
    value = value.toUpperCase().startsWith("Y") ? "X" : " ";
  }

  ColumnText columnText = new ColumnText(contentByte); 

  <<excised code determining fontToUse>>

        setSimpleColumn(columnText, value, fontToUse, formField.box,
                            leading, Element.ALIGN_LEFT, false);
}

'setSimpleColumn()' is a utility routine that handles fitting the text into a rectangle on the form.

private int setSimpleColumn(ColumnText columnText, String value, Font fontToUse, 
                                Rectangle box, int leading, int alignment, boolean simulate)
    throws DocumentException
{
  columnText.setSimpleColumn(new Phrase(value, fontToUse),
                             box.left(), box.bottom(),
                             box.right(), box.top(),
                             leading, alignment);
  int result = columnText.go(simulate);
  return result;
}

So again, the two main questions are: (1) would using PdfWriter.freeReader() help free memory that is otherwise held until the document is complete, and (2) where would I put the call to it?

If someone wants to tell me how to do multi-page forms, I'm interested in that as well...

Upvotes: 4

Views: 2311

Answers (2)

VPaul

Reputation: 1013

Here are three steps that worked for me:

  • Free the memory occupied by the writer. This question explains how to use PdfWriter's freeReader() method:

    Merging 1000 PDF thru iText throws java.lang.OutOfMemoryError: Java heap space

  • Secondly, you can save memory by reading the PDF through a RandomAccessFileOrArray, which makes the reader load objects only as they are needed:

        PdfReader pdfReader = new PdfReader(new RandomAccessFileOrArray(pdf), null);

    instead of

        PdfReader pdfReader = new PdfReader(pdf);

  • Finally, you can call System.gc() to request a garbage collection; the JVM treats the call as a hint and is free to ignore it.
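
Since System.gc() is only a request to the JVM, it helps to measure heap use before and after a change rather than assume memory was released. The following is a minimal sketch using only the JDK's Runtime class; the class name HeapProbe and the 8 MB stand-in allocation are illustrative assumptions, not part of iText or of the original program.

```java
public class HeapProbe {

    /** Bytes of heap currently in use, as reported by the JVM. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();

        // Hypothetical stand-in for filling a batch of forms:
        // hold roughly 8 MB so that the difference is visible.
        byte[][] pages = new byte[8][1024 * 1024];
        long during = usedHeapBytes();

        pages = null;   // drop the only reference to the data
        System.gc();    // a request only; the JVM may ignore it
        long after = usedHeapBytes();

        System.out.println("before: " + before + " bytes");
        System.out.println("during: " + during + " bytes");
        System.out.println("after:  " + after + " bytes");
    }
}
```

Calling usedHeapBytes() at the same points in the real form-filling loop (for example, every 50 pages) should show whether a change such as freeReader() or partial reading actually lowers peak usage.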

Upvotes: 4

James-Jesse Drinkard

Reputation: 15703

I'm not seeing the code that loops through the documents, but PdfWriter.freeReader() will free up memory when you are concatenating multiple documents. Here is the javadoc explanation:

Use this method to writes the reader to the document and free the memory used by it. The main use is when concatenating multiple documents to keep the memory usage restricted to the current appending document.

So is that what you are doing?

As simple as it sounds, what I think you need is to close each document as you loop through the processing, something like:

        //loop iteration
        // step 1
        Document document = new Document();
        // step 2
        PdfWriter.getInstance(document, new FileOutputStream(filename));
        // step 3
        document.open();
        // step 4
        document.add(new Paragraph("Hello World!"));
        //process the document.
        ...
        //save the document.
        ...
        // step 5
        document.close();
        //next loop iteration

Since you don't need to save each document, would it work to combine 20 or 30 forms at a time into a single PDF and close it out, then create another 20 or 30 forms and do the same? At the end you would merge these intermediate documents into the final document, which avoids holding everything open until the end.
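
The batching idea can be sketched as plain arithmetic, independent of iText. In this hedged example the class BatchPlanner and the method planBatches are hypothetical names; the sketch only works out which forms go into which intermediate file, and the comment in main marks where the real program would open and close a Document.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchPlanner {

    /**
     * Splits forms 1..totalForms into consecutive batches.
     * Each entry is {firstForm, lastForm}, 1-based and inclusive.
     */
    static List<int[]> planBatches(int totalForms, int batchSize) {
        if (totalForms < 0 || batchSize <= 0) {
            throw new IllegalArgumentException("totalForms must be >= 0 and batchSize > 0");
        }
        List<int[]> batches = new ArrayList<>();
        for (int first = 1; first <= totalForms; first += batchSize) {
            int last = Math.min(first + batchSize - 1, totalForms);
            batches.add(new int[] { first, last });
        }
        return batches;
    }

    public static void main(String[] args) {
        for (int[] batch : planBatches(500, 30)) {
            // In the real program: open a new Document and PdfWriter here,
            // fill forms batch[0]..batch[1], then close the Document.
            System.out.println("forms " + batch[0] + " to " + batch[1]);
        }
    }
}
```

For a 500-page document with batches of 30, this yields 17 intermediate files, the last one holding forms 481 to 500. A smaller batch size lowers the amount held open at once, at the cost of more files to merge at the end.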

Upvotes: 1
