Apuranic

Reputation: 39

Read a docx document using Java

I have a steganography project that hides a docx document inside a JPEG image. Using Apache POI, I can open and read the docx document, but only the text is read.

The pictures are not read, even though the document contains some.

Here is the code:

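import java.io.FileInputStream;
import java.io.IOException;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

// try-with-resources closes the input stream even if reading fails
try (FileInputStream in = new FileInputStream(directory))
{
    XWPFDocument datax = new XWPFDocument(in);
    XWPFWordExtractor extract = new XWPFWordExtractor(datax);
    // getText() returns only the text of the document, not its pictures
    this.isi_file = extract.getText();
}
catch (IOException x)
{
    // report the failure instead of silently ignoring it
    x.printStackTrace();
}
System.out.println("isi :" + this.isi_file);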

How can I read all components of the docx document using Java? Thank you for your help.

Upvotes: 1

Views: 733

Answers (1)

Alexey Prudnikov

Reputation: 1123

Please check the documentation for the XWPFDocument class. It contains some useful methods, for example getAllPictures().

Your code snippet contains the line XWPFDocument datax = new XWPFDocument(in);. After that line, you can write something like:

// process all pictures in document
for (XWPFPictureData picture : datax.getAllPictures()) {
    // get each picture as byte array
    byte[] pictureData = picture.getData();
    // process picture somehow
    ...
}
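
If the goal is only to get the raw picture bytes (for example, to embed them alongside the text), an alternative that does not depend on POI may also be worth considering. A .docx file is a ZIP package, and embedded images are normally stored as entries under word/media/. The sketch below reads those entries with the standard java.util.zip classes. The class name DocxMediaReader and the method readMedia are hypothetical names chosen for illustration, and the code assumes the document keeps its images in that conventional location.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Apache POI.
public class DocxMediaReader {

    // Returns a map of entry name -> raw bytes for every ZIP entry
    // whose name starts with "word/media/" (assumed image location).
    public static Map<String, byte[]> readMedia(InputStream docx) throws IOException {
        Map<String, byte[]> media = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory() || !entry.getName().startsWith("word/media/")) {
                    continue; // skip document XML, styles, relationships, etc.
                }
                // Copy the current entry's bytes into memory.
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
                media.put(entry.getName(), buffer.toByteArray());
            }
        }
        return media;
    }
}
```

It could be called as DocxMediaReader.readMedia(new FileInputStream(directory)). Unlike getAllPictures(), this gives no information about where in the document each picture is used; it only returns the stored files.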

Upvotes: 3
