spurohit

Reputation: 15

Newer PDFBox versions load PDFs slowly

I am migrating a legacy framework from PDFBox 1.8 to PDFBox 2.x. However, the time required to load a PDF roughly doubles with the newer versions (about 100 ms on 1.8 versus about 200 ms on 2.x), and our application is very sensitive to latency.

I searched for the cause of the increased latency but had no luck. I would appreciate help from the community in finding a way to resolve this. The only line of code involved is:

PDDocument pdfDoc = PDDocument.load(new File(pdfFilePath));

What I have already tried:

  1. Varying the memoryUsageSettings: main memory with no restriction, temp file only, and a combination of main memory and temp file.
  2. Comparing timings across several 2.x versions, including the latest release; all of them are slower than the older version.

Thanks in advance!

Upvotes: 1

Views: 996

Answers (1)

Tilman Hausherr

Reputation: 18861

Some initialization happens only when the first document is opened (fonts, color spaces, some class loading); see also the discussion in PDFBOX-3988. Run this code (taken from the PDFDebugger sources) once at startup so that this work is done before the first document is loaded.

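// imports needed by the snippet below (PDFBox 2.x package names)
import javax.imageio.spi.IIORegistry;
import org.apache.pdfbox.cos.COSName;
import org.apache.pdfbox.filter.FilterFactory;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.graphics.color.PDDeviceCMYK;
import org.apache.pdfbox.pdmodel.graphics.color.PDDeviceRGB;

// trigger premature initializations for more accurate rendering benchmarks
// See discussion in PDFBOX-3988
// (place inside a method that declares or handles IOException)
if (PDType1Font.COURIER.isStandard14())
{
    // Yes this is always true
    PDDeviceCMYK.INSTANCE.toRGB(new float[] { 0, 0, 0, 0 });
    PDDeviceRGB.INSTANCE.toRGB(new float[] { 0, 0, 0 });
    IIORegistry.getDefaultInstance();
    FilterFactory.INSTANCE.getFilter(COSName.FLATE_DECODE);
}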
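To check whether the extra time really is one-time initialization rather than a per-document cost, it can help to time several loads in the same JVM and compare the first run with the later ones. The following is only a rough sketch using the standard library: `LoadTimer`, `timeRuns` and `steadyStateMedian` are hypothetical names, not part of PDFBox, and the task in `main` is a stand-in to be replaced by the real load call.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;

// Hypothetical helper (not part of PDFBox): times repeated runs of a task
// so that the first run can be compared with the later ones.
final class LoadTimer {

    // Runs the task `runs` times and returns each run's wall-clock time in nanoseconds.
    static long[] timeRuns(int runs, Callable<?> task) throws Exception {
        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.call();
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    // Median of all runs except the first (upper median for an even count).
    static long steadyStateMedian(long[] nanos) {
        if (nanos.length < 2) {
            throw new IllegalArgumentException("need at least two runs");
        }
        long[] rest = Arrays.copyOfRange(nanos, 1, nanos.length);
        Arrays.sort(rest);
        return rest[rest.length / 2];
    }

    public static void main(String[] args) throws Exception {
        // Stand-in task. With PDFBox 2.x on the classpath, replace the body with e.g.
        //   try (PDDocument doc = PDDocument.load(new File(pdfFilePath))) { }
        long[] nanos = timeRuns(10, () -> {
            Thread.sleep(1);
            return null;
        });
        System.out.println("first run (ns): " + nanos[0]);
        System.out.println("median of later runs (ns): " + steadyStateMedian(nanos));
    }
}
```

If the first run is much slower than the median of the later runs, the difference is most likely the one-time initialization described above, and running the warm-up code at application startup should move that cost out of the first load.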

Upvotes: 2
