KishanCS

Reputation: 1377

Convert DOC [HWPFDocument] to PDF [with fonts, tables and images] using Java

I am converting a .doc file to PDF.

I am using the following code:

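        import java.io.FileInputStream;
        import java.io.FileOutputStream;
        import org.apache.poi.hwpf.HWPFDocument;
        import org.apache.poi.hwpf.extractor.WordExtractor;
        import org.apache.poi.poifs.filesystem.POIFSFileSystem;
        // iText 5 package names assumed; iText 2.x uses com.lowagie.text instead
        import com.itextpdf.text.Document;
        import com.itextpdf.text.Paragraph;
        import com.itextpdf.text.pdf.PdfWriter;

        // Read the binary .doc file and extract its plain text
        POIFSFileSystem fs = new POIFSFileSystem(new FileInputStream(srcFile));
        HWPFDocument doc = new HWPFDocument(fs);
        WordExtractor we = new WordExtractor(doc);

        Document pdfDocument = new Document();
        PdfWriter writer = PdfWriter.getInstance(pdfDocument, new FileOutputStream(targetFile));

        pdfDocument.open();
        writer.setPageEmpty(true);
        pdfDocument.newPage();
        writer.setPageEmpty(true);
        // Each paragraph is added as plain text, so fonts, tables and images are lost
        for (String paragraph : we.getParagraphText()) {
            pdfDocument.add(new Paragraph(paragraph));
        }
        pdfDocument.close(); // finish writing the PDF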

This generates a PDF without formatting or images; even the fonts are missing.

Since WordExtractor extracts only text, is there any other way to convert the document with its fonts and images? I need the conversion for .doc (HWPFDocument), not .docx.
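One direction that might carry fonts over is to walk the document's character runs instead of its extracted text, and to map each run's properties to PDF font parameters. The sketch below covers only that mapping step and uses nothing beyond the JDK. `RunStyleSketch`, `toPoints` and `toStyle` are hypothetical names, not part of Apache POI or iText. It assumes that HWPF reports font sizes in half-points and that the style bits match the `Font.BOLD` and `Font.ITALIC` constants of iText 5.

```java
// Hypothetical helper for illustration; not part of Apache POI or iText.
final class RunStyleSketch {

    // Style bits, assumed to match iText 5's Font.NORMAL, Font.BOLD and Font.ITALIC.
    static final int NORMAL = 0;
    static final int BOLD = 1;
    static final int ITALIC = 2;

    private RunStyleSketch() {
    }

    /** Converts a size in half-points (as HWPF reports it) to points. */
    static float toPoints(int halfPoints) {
        return halfPoints / 2.0f;
    }

    /** Combines a run's bold and italic flags into one style bitmask. */
    static int toStyle(boolean bold, boolean italic) {
        int style = NORMAL;
        if (bold) {
            style |= BOLD;
        }
        if (italic) {
            style |= ITALIC;
        }
        return style;
    }
}
```

With POI and iText 5 on the classpath, the inputs would presumably come from `CharacterRun` (`getFontName()`, `getFontSize()`, `isBold()`, `isItalic()`, `text()`), reached through `doc.getRange()`, `getParagraph(i)` and `getCharacterRun(j)`. The results would go into `FontFactory.getFont(name, size, style)` and a `Chunk` per run. Fonts outside the standard PDF set would likely need to be registered first, and this does not address tables or images.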

I have looked at these SO links:

Convert doc to pdf using Apache POI

https://stackoverflow.com/a/6210694/6032482

how to convert doc,docx files to pdf in java programatically

and many more, but found that they all use WordExtractor.

Note: I can't use LibreOffice or Aspose.

Can it be done using:

Apache POI

docx4j

iText

Upvotes: 2

Views: 4674

Answers (0)