1392023093user

Reputation: 1096

How to read empty, but formatted, Excel cells with Apache POI?

I have a method for reading Excel cells using Apache POI, and it works fine. Well... almost fine.

public static ArrayList<ArrayList<String>> readXLsXFile() throws FileNotFoundException, IOException {

        ArrayList<ArrayList<String>> outListaExcel = new ArrayList<ArrayList<String>>();

        // pathToExcelFile is assumed to be a field defined elsewhere in the class
        FileInputStream fis = new FileInputStream(pathToExcelFile);
        XSSFWorkbook workbook = new XSSFWorkbook(fis);
        XSSFSheet sheetAr = workbook.getSheetAt(0);
        Iterator<Row> rowsAr = sheetAr.rowIterator();
        while (rowsAr.hasNext()) {
            XSSFRow row1 = (XSSFRow) rowsAr.next();
            Iterator<Cell> cellsAr = row1.cellIterator();
            ArrayList<String> arr = new ArrayList<String>();
            while (cellsAr.hasNext()) {
                XSSFCell cell1 = (XSSFCell) cellsAr.next();
                // Styled-but-empty cells are returned too and end up as ""
                arr.add(String.valueOf(cell1));
            }
            outListaExcel.add(arr);
        }
        fis.close();
        return outListaExcel;
    }

If cells are formatted, for example if the whole of column A has borders, then it keeps reading the empty cells and gives me empty strings. How can I ignore those empty (formatted) cells?

So readXLsXFile will give me an ArrayList with

[0] -> [1][2]
[1] -> [3][4] 

But it will also give ten more nodes with empty strings, because column A is formatted with borders.

Edit after Gagravarr's answer:

I can work around this by checking whether subList is empty and, if so, not adding it to mainLista. But with very large .xls files, and many of them, that will take too long, and in general I don't think it is good practice.

My question was whether there is something for rows, like there is for cells, that I have overlooked.

ArrayList<ArrayList<String>> mainLista = new ArrayList<ArrayList<String>>();
for (int rowNum = rowStart; rowNum < rowEnd; rowNum++) {
    Row r = sheet.getRow(rowNum);
    if (r == null) {
        continue; // row is not defined in the file at all
    }
    int lastColumn = r.getLastCellNum();
    ArrayList<String> subList = new ArrayList<String>();
    for (int cn = 0; cn < lastColumn; cn++) {
        Cell c = r.getCell(cn, Row.RETURN_BLANK_AS_NULL);
        if (c != null) {
            subList.add(c.getStringCellValue());
        }
    }
    // I don't think this is a good way to do it,
    // because it still reads the empty rows
    if (!subList.isEmpty()) {
        mainLista.add(subList);
    }
}

Upvotes: 4

Views: 5168

Answers (2)

mario

Reputation: 731

Set the border for column B; in my case that helped.

Upvotes: 0

Gagravarr

Reputation: 48376

As explained in the Apache POI Documentation on Iterate over rows and cells, the iterators only give you the rows and cells which are defined and have/had content.

If you want to fetch cells with full control over blank or empty cells, you need to instead use something like:

// Decide which rows to process
int rowStart = Math.min(15, sheet.getFirstRowNum());
int rowEnd = Math.max(1400, sheet.getLastRowNum());

for (int rowNum = rowStart; rowNum < rowEnd; rowNum++) {
   Row r = sheet.getRow(rowNum);
   if (r == null) {
      // This whole row is undefined in the file; handle it as needed
      continue;
   }

   int lastColumn = Math.max(r.getLastCellNum(), MY_MINIMUM_COLUMN_COUNT);

   for (int cn = 0; cn < lastColumn; cn++) {
      // In newer POI versions the constant is Row.MissingCellPolicy.RETURN_BLANK_AS_NULL
      Cell c = r.getCell(cn, Row.RETURN_BLANK_AS_NULL);
      if (c == null) {
         // The spreadsheet is empty in this cell
      } else {
         // Do something useful with the cell's contents
      }
   }
}

If you want to fetch blank cells (typically those with styling but no values), try the other missing cell policies, e.g. RETURN_NULL_AND_BLANK.
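
On the follow-up about rows: as far as I know, POI has no row-level counterpart to the missing cell policy, so a row that exists only because of styling still has to be detected by looking at its cells. Below is a minimal sketch of that check in plain Java, with each row modeled as a list of strings instead of POI objects. The names BlankRowFilter, isBlankRow and dropBlankRows are hypothetical and not part of POI.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BlankRowFilter {

    // True when every cell in the row is null or contains only whitespace.
    static boolean isBlankRow(List<String> row) {
        for (String cell : row) {
            if (cell != null && !cell.trim().isEmpty()) {
                return false;
            }
        }
        return true;
    }

    // Returns a new list holding only the rows with at least one non-blank cell.
    static List<List<String>> dropBlankRows(List<List<String>> rows) {
        List<List<String>> kept = new ArrayList<>();
        for (List<String> row : rows) {
            if (!isBlankRow(row)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Two data rows followed by rows that only exist because of styling
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("1", "2"),
            Arrays.asList("3", "4"),
            Arrays.asList("", ""),
            Arrays.asList("")
        );
        System.out.println(dropBlankRows(rows)); // prints [[1, 2], [3, 4]]
    }
}
```

The check stops at the first non-blank cell, so its cost is proportional to the number of cells already being read. It should not add much on top of reading the file itself.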

Upvotes: 2
