Created by lucky

Reputation: 255

Unable to read entire cell of a .doc file using Apache POI

In one of my projects I need to read images from a .doc file using Apache POI. Each row has a cell containing one or more images (one, two, three, etc.), which I need to read out along with the text data.

So I tried the following code:

    FileInputStream fileInputStream = new FileInputStream(file);
    POIFSFileSystem poifsFileSystem = new POIFSFileSystem(fileInputStream);
    HWPFDocument doc = new HWPFDocument(poifsFileSystem);
    Range range = doc.getRange();
    PicturesTable pictureTable = doc.getPicturesTable();
    PicturesSource pictures = new PicturesSource(doc);

    Paragraph tableParagraph = range.getParagraph(0);
    Table table = range.getTable(tableParagraph);
    TableRow row = table.getRow(0);
    TableCell cell1 = row.getCell(0);

    for (int j = 0; j < cell1.getParagraph(0).numCharacterRuns(); j++) {
        CharacterRun cr = cell1.getParagraph(0).getCharacterRun(j);
        if (pictureTable.hasPicture(cr)) {
            logger.debug("Has picture If--");
            Picture picture = pictures.getFor(cr);
            logger.debug("pictures Description--" + picture.getDescription());
        }
    }

With this I can read images from a particular cell, but not all of them. I can read an image that comes before the text and an image in between the text, but I cannot read an image that follows the text. For example, given "image_1---some text---image_2 some text---.image_3", image_3 is the only image I cannot read. What should I do so that I can read image_3 as well? I have searched a lot but had no luck so far. I hope someone knows how to do this. Thanks in advance.

Upvotes: 0

Views: 232

Answers (1)

JensS

Reputation: 1151

I am having problems with HWPFDocument, too. If you can convert the Word documents to .docx before processing, here is an example that works with XWPFDocument:

    FileInputStream fileInputStream = new FileInputStream(file);

    XWPFDocument doc = new XWPFDocument(fileInputStream);
    for (XWPFTable tbl : doc.getTables()) {
        for (XWPFTableRow row : tbl.getRows()) {
            for (XWPFTableCell cell : row.getTableCells()) {
                for (XWPFParagraph para : cell.getParagraphs()) {
                    for (XWPFRun run : para.getRuns()) {
                        for (XWPFPicture pic : run.getEmbeddedPictures()) {
                            System.out.println(pic.getPictureData());
                        }

                    }
                }
            }
        }
    }
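
If you go the conversion route, it may be worth checking that every image actually made it into the converted file before debugging the POI code. A .docx file is a ZIP archive, and Word normally stores embedded images under word/media/, so a rough check needs only the JDK. This is a minimal sketch and not part of Apache POI: the class name DocxMediaLister and the method listMediaEntries are hypothetical, and the word/media/ location is an assumption about how the file was saved.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration; it does not use or belong to Apache POI.
class DocxMediaLister {

    // Assumption: embedded images are stored under word/media/ inside the archive.
    static final String MEDIA_PREFIX = "word/media/";

    /** Returns the sorted names of all non-directory entries under word/media/. */
    static List<String> listMediaEntries(Path docx) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(docx.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                if (!entry.isDirectory() && entry.getName().startsWith(MEDIA_PREFIX)) {
                    names.add(entry.getName());
                }
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java DocxMediaLister <file.docx>");
            return;
        }
        // Print one archive entry name per line.
        for (String name : listMediaEntries(Paths.get(args[0]))) {
            System.out.println(name);
        }
    }
}
```

If the number of entries printed is lower than the number of images you expect, the image was probably lost during conversion rather than skipped by the table loop. Note that this only counts images in the whole document; it does not tell you which cell an image belongs to, so the XWPF loop above is still needed for that.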

Upvotes: 1
