Vincent Couturier
Vincent Couturier

Reputation: 526

How to read real numeric values instead of formatted value using Apache XSSF POI streaming API?

I use streaming POI API and would like to read the real value of a cell instead of the formatted one. My code which is below works fine but if the user doesn't display all the digit of a value in the excel sheet which is readed by my code, I've got the same truncated value in my result. I didn't find any solution in the streaming API - which is needed in my case to solve memory issue I had using the POI API without streaming.

    /**
     * @see org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.SheetContentsHandler cell(java.lang.String,
     *      java.lang.String)
     */
    @Override
    void cell(String cellReference, String formattedValue, XSSFComment comment) { 
       useTheCellValue(formattedValue) 
    }

Upvotes: 1

Views: 970

Answers (2)

Axel Richter
Axel Richter

Reputation: 61870

If you are constructing the XSSFSheetXMLHandler you can provide a DataFormatter. So if you are creating your own DataFormatter this DataFormatter could give you fully access to the formatting issues.

Example of how this could look like by changing the public void processSheet of the XLSX2CSV example in svn:

...
public void processSheet(
        StylesTable styles,
        ReadOnlySharedStringsTable strings,
        SheetContentsHandler sheetHandler, 
        InputStream sheetInputStream) throws IOException, SAXException {
    //DataFormatter formatter = new DataFormatter();
    DataFormatter formatter = new DataFormatter(java.util.Locale.US) {
        //do never formatting double values but do formatting dates
        public java.lang.String formatRawCellContents(double value, int formatIndex, java.lang.String formatString) {
            if (org.apache.poi.ss.usermodel.DateUtil.isADateFormat(formatIndex, formatString)) {
                return super.formatRawCellContents(value, formatIndex, formatString);
            } else {
                //return java.lang.String.valueOf(value);
                return super.formatRawCellContents(value, 0, "General");
            }
        }
    };
    InputSource sheetSource = new InputSource(sheetInputStream);
    try {
        XMLReader sheetParser = SAXHelper.newXMLReader();
        ContentHandler handler = new XSSFSheetXMLHandler(
              styles, null, strings, sheetHandler, formatter, false);
        sheetParser.setContentHandler(handler);
        sheetParser.parse(sheetSource);
     } catch(ParserConfigurationException e) {
        throw new RuntimeException("SAX parser appears to be broken - " + e.getMessage());
     }
}
...

Upvotes: 3

Vincent Couturier
Vincent Couturier

Reputation: 526

I've seen a ticket on POI about this point : https://bz.apache.org/bugzilla/show_bug.cgi?id=61858

It provides a first solution by changing the existing class.

This could be an interesting workaround even if the ideal solution should be to use a standard one.

Upvotes: 0

Related Questions