LessThanTrue

Reputation: 21

Encoding problems with PDF

I have a fairly simple Java Spring Boot REST service that renders PDFs from input, and I am testing it with IntelliJ.

I use PDFBox to create the PDFs.

One feature is that the client can supply annexes as byte[] in addition to the regular content it wants.

Problem

When users try the service, the annex pages in the final document are blank.

Investigation

When I noticed that everything works fine with Postman, I changed the IntelliJ default file encoding for the generated response file (from UTF-8 to ISO-8859-1), and subsequent documents were clear and correct. Note that the problem seems to affect only the annexes; the regular content is always fine.

Question

Other Information

I tried many byte conversions without success, for instance:

new String(annexe, StandardCharsets.ISO_8859_1).getBytes(StandardCharsets.UTF_8);

But each time I got an exception:

java.io.IOException: java.util.zip.DataFormatException: invalid stored block lengths
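
A possible explanation for why conversions like this fail: a PDF is binary data, and passing it through a String with two different charsets does not preserve the bytes. Decoding as ISO-8859-1 and re-encoding as UTF-8 turns every byte at or above 0x80 into two bytes, so compressed streams inside the file no longer inflate. The sketch below only illustrates that effect on a handful of bytes; the class name, method name, and sample bytes are made up for the illustration and are not taken from the service.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetRoundTripDemo {

    // Applies the same conversion as the one quoted in the question.
    static byte[] reencode(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "%PDF" followed by four bytes above 0x7F, as commonly found in binary PDF content.
        byte[] sample = {0x25, 0x50, 0x44, 0x46, (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};
        byte[] converted = reencode(sample);

        // The ASCII bytes survive, but each high byte becomes a two-byte UTF-8 sequence.
        System.out.println("original length:  " + sample.length);
        System.out.println("converted length: " + converted.length);
        System.out.println("identical:        " + Arrays.equals(sample, converted));
    }
}
```

Since the converted array is longer than the original and no longer matches it, any parser reading it as a PDF sees damaged stream data.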

The document is sent back as byte[] like this:

ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
pdfDocument.save(outputStream);
pdfDocument.close();
return outputStream.toByteArray();

Saving the document into a file is quite the same code, just a FileOutputStream is given instead.

Annexes are added to the document like this:

// Parse each annex from its raw bytes and copy its pages into the main document.
for (byte[] content : annexes) {
    PDDocument annex = PDDocument.load(content);
    for (PDPage page : annex.getPages()) {
        pdfDocument.importPage(page);
    }
}

I also tried PDFMergerUtility but got the same result (blank pages for the annexes).

Upvotes: 0

Views: 5492

Answers (1)

LessThanTrue

Reputation: 21

Thanks to Tilman Hausherr's suggestion, I tried encoding the byte[] with Base64.getEncoder().encode(...), and that does the job!

The client now has to deal with a Base64-encoded string, but at least it works.
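
For anyone landing here, a minimal sketch of the Base64 round trip, assuming the binary PDF bytes travel as text between client and service. The class and method names are hypothetical, and this shows only the encoding step, not the actual service code. It uses encodeToString rather than encode(...) so the result is directly a String.

```java
import java.util.Arrays;
import java.util.Base64;

public class PdfBase64Demo {

    // Sender side: turn raw bytes into an ASCII-only string that no charset setting can alter.
    static String encode(byte[] pdfBytes) {
        return Base64.getEncoder().encodeToString(pdfBytes);
    }

    // Receiver side: recover exactly the original bytes before parsing or saving them.
    static byte[] decode(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        // "%PDF" followed by four bytes above 0x7F, standing in for real PDF content.
        byte[] original = {0x25, 0x50, 0x44, 0x46, (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};

        String transported = encode(original);
        byte[] restored = decode(transported);

        System.out.println("transported: " + transported);
        System.out.println("round trip intact: " + Arrays.equals(original, restored));
    }
}
```

The trade-off is size: Base64 output is roughly a third larger than the raw bytes, which is usually acceptable for documents of this kind.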

Thank you!

Upvotes: 1
