Mahesh P

Reputation: 35

Getting "java.lang.ArrayIndexOutOfBoundsException" while reading ".doc" files using Apache POI API

Here is the exception I am getting:

Exception in thread "AWT-EventQueue-0" java.lang.ArrayIndexOutOfBoundsException: 20203
    at org.apache.poi.util.LittleEndian.getShort(LittleEndian.java:45)
    at org.apache.poi.hwpf.model.ListLevel.<init>(ListLevel.java:120)
    at org.apache.poi.hwpf.model.ListFormatOverrideLevel.<init>(ListFormatOverrideLevel.java:48)
    at org.apache.poi.hwpf.model.ListTables.<init>(ListTables.java:88)
    at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:267)
    at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:157)
    at com.mahesh.MyFrame.readMSDocuments(MyFrame.java:301)
    at com.mahesh.MyFrame.readALLDocuments(MyFrame.java:276)
    at com.mahesh.MyFrame.access$1(MyFrame.java:269)
    at com.mahesh.MyFrame$2.actionPerformed(MyFrame.java:231)
    at javax.swing.AbstractButton.fireActionPerformed(Unknown Source)
    at javax.swing.AbstractButton$Handler.actionPerformed(Unknown Source)
    at javax.swing.DefaultButtonModel.fireActionPerformed(Unknown Source)
    at javax.swing.DefaultButtonModel.setPressed(Unknown Source)
    at javax.swing.plaf.basic.BasicButtonListener.mouseReleased(Unknown Source)
    at java.awt.Component.processMouseEvent(Unknown Source)
    at javax.swing.JComponent.processMouseEvent(Unknown Source)
    at java.awt.Component.processEvent(Unknown Source)
    at java.awt.Container.processEvent(Unknown Source)
    at java.awt.Component.dispatchEventImpl(Unknown Source)
    at java.awt.Container.dispatchEventImpl(Unknown Source)
    at java.awt.Component.dispatchEvent(Unknown Source)
    at java.awt.LightweightDispatcher.retargetMouseEvent(Unknown Source)
    at java.awt.LightweightDispatcher.processMouseEvent(Unknown Source)
    at java.awt.LightweightDispatcher.dispatchEvent(Unknown Source)
    at java.awt.Container.dispatchEventImpl(Unknown Source)
    at java.awt.Window.dispatchEventImpl(Unknown Source)
    at java.awt.Component.dispatchEvent(Unknown Source)
    at java.awt.EventQueue.dispatchEvent(Unknown Source)
    at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source)
    at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
    at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
    at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
    at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
    at java.awt.EventDispatchThread.run(Unknown Source)

And this is my code

private void readMSDocuments(String fileToRead) {

    boolean containsEditorAndMt = false;
    String fileEditorAndMt = null;
    dataArray = null;
    try {

        fis = new FileInputStream(new File(fileToRead).getAbsolutePath());
        fs = new POIFSFileSystem(fis);
        // The stack trace shows the exception being thrown inside this constructor
        document = new HWPFDocument(fs);
        wordExtractor = new WordExtractor(document);
        dataList = new ArrayList();
        // getParagraphText() returns one array element per paragraph,
        // so each paragraph is read as a single line (visible in the console)
        dataArray = wordExtractor.getParagraphText();
        System.out.println(dataArray.length);
        if (dataArray.length > 0) {

            for (int k = 0; k < dataArray.length; ++k) {

                // Keep only non-blank paragraphs
                if (dataArray[k].trim().length() > 0) {

                    dataList.add(dataArray[k].trim());
                    //System.out.println(fileToRead+" "+dataArray[k].trim()+"\n");
                }
            }
        }

    } catch (IOException e) {
        // FileInputStream, POIFSFileSystem and HWPFDocument can all throw IOException
        e.printStackTrace();
    }
}

Could anyone help me find out why the exception is thrown?

Upvotes: 1

Views: 1917

Answers (1)

GingerHead

Reputation: 8230

You are using the Apache POI API, and this exception comes from a bug in that library rather than from your code.

You can read about this bug and analyze it through this.
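Until the library bug is fixed or avoided, one possible workaround is to treat the failure as a per-file error so that a single malformed .doc does not abort the whole run. The sketch below is only an illustration under assumptions: `SafeDocReader` and `ParagraphReader` are hypothetical names that are not part of POI, and the reader passed in is assumed to wrap the question's `new WordExtractor(new HWPFDocument(fs)).getParagraphText()` call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical helper (not part of Apache POI): isolates parsing failures per file.
final class SafeDocReader {

    // Hypothetical stand-in for the code that opens a .doc and returns its paragraphs,
    // e.g. a lambda wrapping new WordExtractor(new HWPFDocument(fs)).getParagraphText().
    @FunctionalInterface
    interface ParagraphReader {
        String[] read(String path) throws Exception;
    }

    // Returns the trimmed, non-blank paragraphs, or Optional.empty() if reading failed.
    static Optional<List<String>> readParagraphs(String path, ParagraphReader reader) {
        try {
            String[] paragraphs = reader.read(path);
            List<String> result = new ArrayList<>();
            for (String paragraph : paragraphs) {
                String trimmed = paragraph.trim();
                if (!trimmed.isEmpty()) {
                    result.add(trimmed);
                }
            }
            return Optional.of(result);
        } catch (Exception e) {
            // Catches checked exceptions such as IOException as well as runtime
            // failures such as the ArrayIndexOutOfBoundsException in the question.
            System.err.println("Skipping " + path + ": " + e);
            return Optional.empty();
        }
    }
}
```

Catching a runtime exception this way does not repair the document or the library; it only lets the caller log the file and move on to the next one.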

Upvotes: 1
