Reputation: 255
In one of my projects I need to read images from a .doc file using Apache POI. The document contains a table, and each row has a cell containing one or more images, which I need to read along with the text data.
So I tried the following code:
FileInputStream fileInputStream = new FileInputStream(file);
POIFSFileSystem poifsFileSystem = new POIFSFileSystem(fileInputStream);
HWPFDocument doc = new HWPFDocument(poifsFileSystem);
Range range = doc.getRange();
PicturesTable pictureTable = doc.getPicturesTable();
PicturesSource pictures = new PicturesSource(doc);
Paragraph tableParagraph = range.getParagraph(0);
Table table = range.getTable(tableParagraph);
TableRow row = table.getRow(0);
TableCell cell1 = row.getCell(0);
for (int j = 0; j < cell1.getParagraph(0).numCharacterRuns(); j++) {
    CharacterRun cr = cell1.getParagraph(0).getCharacterRun(j);
    if (pictureTable.hasPicture(cr)) {
        logger.debug("Has picture If--");
        Picture picture = pictures.getFor(cr);
        logger.debug("pictures Description--" + picture.getDescription());
    }
}
Now I am able to read images from a particular cell, but not all of them: I can read an image that comes before the text and an image in between the text, but not an image that comes after the text. For example, with "image_1---some text---image_2 some text---.image_3", the only image I cannot read is image_3. What should I do so that I can read image_3 as well? I have searched a lot but had no luck so far. I hope someone knows how to do this. Thanks in advance.
Upvotes: 0
Views: 232
Reputation: 1151
I am having problems with HWPFDocument, too. If you can convert the Word documents to .docx before processing, here is an example that works with XWPFDocument:
FileInputStream fileInputStream = new FileInputStream(file);
XWPFDocument doc = new XWPFDocument(fileInputStream);
for (XWPFTable tbl : doc.getTables()) {
    for (XWPFTableRow row : tbl.getRows()) {
        for (XWPFTableCell cell : row.getTableCells()) {
            for (XWPFParagraph para : cell.getParagraphs()) {
                for (XWPFRun run : para.getRuns()) {
                    for (XWPFPicture pic : run.getEmbeddedPictures()) {
                        System.out.println(pic.getPictureData());
                    }
                }
            }
        }
    }
}
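If you go the .docx route, it may help to first check that all of the images survived the conversion, independently of POI. A .docx file is a ZIP package, and embedded images are normally stored as entries under word/media/. Below is a minimal sketch using only java.util.zip. The class name DocxMediaLister and the method listMediaEntries are made up for illustration and are not part of Apache POI. It also only counts the images in the whole package and does not tell you which table cell each one belongs to.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Illustrative helper (not part of Apache POI): lists the image parts of a .docx package.
class DocxMediaLister {

    // Returns the names of all ZIP entries stored under word/media/,
    // which is where Word normally keeps embedded images.
    static List<String> listMediaEntries(InputStream docxStream) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(docxStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory() && entry.getName().startsWith("word/media/")) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }
}
```

Calling DocxMediaLister.listMediaEntries(new FileInputStream(file)) on the converted file and comparing the size of the result with the number of images you expect should show whether image_3 is missing from the file itself or is only being skipped by the reading code.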
Upvotes: 1