"Java Apache POI: Cannot get a STRING value from a NUMERIC cell - how to handle mixed data types?"

Question

In my Java application, I am using Apache POI to read data from an Excel file. The data structure consists of three columns in the following format:

Column A: Text (e.g., product codes)
Column B: Text (e.g., product descriptions)
Column C: Numeric (e.g., product prices)

However, when my program attempts to read the data, I encounter the following error:

Exception in thread "main" java.lang.IllegalStateException: Cannot get a STRING value from a NUMERIC cell 
        at org.apache.poi.xssf.usermodel.XSSFCell.typeMismatch(XSSFCell.java:984)
        at org.apache.poi.xssf.usermodel.XSSFCell.getRichStringCellValue(XSSFCell.java:311)
        at org.apache.poi.xssf.usermodel.XSSFCell.getStringCellValue(XSSFCell.java:280)
        at com.example.ExcelReader.readExcel(ExcelReader.java:37)
        at com.example.Main.main(Main.java:26)

Code causing the issue (line 37 in ExcelReader.readExcel()):

String keyValue = getCellValueAsString(keyCell).trim().replaceAll("\s+", "");

Code causing the issue (line 26 in Main.main()):

Map excelData = ExcelReader.readExcel(excelFilePath);

But i think that problem could be somewhere else because when i put in line 37 default System.out.println("Whatever"); application throw same exception in exactly same line

Build.gradle

plugins {
    id 'java'
    id 'application'
    id 'org.beryx.jlink' version '2.26.0'
}

application {
    mainClass = 'com.example.Main'
}

group = 'com.example'
version = '1.0'

java {
    sourceCompatibility = JavaVersion.VERSION_17
    targetCompatibility = JavaVersion.VERSION_17
    modularity.inferModulePath = true
}

ext.moduleName = 'com.example'

tasks.compileJava {
    inputs.property('moduleName', moduleName)
    doFirst {
        options.compilerArgs = [
                '--module-path', classpath.asPath
        ]
        classpath = files()
    }
}

tasks.withType(JavaCompile).configureEach {
    options.encoding = 'UTF-8'
}

repositories {
    mavenCentral()
}

dependencies {
    implementation 'org.apache.pdfbox:pdfbox:2.0.30'
    implementation 'org.apache.poi:poi-ooxml:5.3.0'
    implementation 'org.apache.commons:commons-math3:3.6'
    implementation 'org.apache.logging.log4j:log4j-core:2.20.0'
    implementation 'org.apache.logging.log4j:log4j-api:2.20.0'


    testImplementation platform('org.junit:junit-bom:5.10.3')
    testImplementation 'org.junit.jupiter:junit-jupiter'
}

tasks.jar {
    archiveBaseName = 'app'
}

tasks.register('copyDependencies', Copy) {
    def outputDir = layout.buildDirectory.dir('libs')
    from configurations.runtimeClasspath
    into outputDir
}

jlink {
    launcher {
        name = 'my-app'
    }
    jpackage {
        installerType = 'exe' // lub 'msi', 'dmg', 'pkg', 'deb', 'rpm'
        installerOptions = [
                '--description', 'Aplikacja do przetwarzania plików PDF i Excel',
                '--vendor', 'MojaFirma',
                '--win-per-user-install',
                '--win-dir-chooser',
                '--win-shortcut'
        ]
        imageOptions = ['--icon', 'src/main/resources/icon.ico']
    }
}

test {
    useJUnitPlatform()
}

ExcelReader.java

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class ExcelReader {
    public static Map readExcel(String excelFilePath) throws IOException {
        Map dataMap = new HashMap<>();
        File file = new File(excelFilePath);

        if (!file.exists()) {
            throw new FileNotFoundException("Plik z danymi nie istnieje: " + excelFilePath);
        }

        try (FileInputStream fis = new FileInputStream(file);
             Workbook workbook = new XSSFWorkbook(fis)) {
            Sheet sheet = workbook.getSheetAt(0);
            for (Row row : sheet) {
                Cell keyCell = row.getCell(0);
                Cell equivalentCell = row.getCell(1);
                Cell priceCell = row.getCell(2);

                // Logowanie typów danych każdej komórki
                logCellType("IFS", keyCell);
                logCellType("ILME", equivalentCell);
                logCellType("Cena", priceCell);

                if (keyCell != null && equivalentCell != null && priceCell != null) {
                    // Pobranie wartości z komórek
                    String keyValue = getCellValueAsString(keyCell).trim().replaceAll("\s+", "");
                    String equivalentValue = getCellValueAsString(equivalentCell).trim();
                    String priceValue = getCellValueAsString(priceCell)
                            .replace(" zł", "").replace(",", ".").trim();

                    dataMap.put(keyValue, new String[]{equivalentValue, priceValue});
                    System.out.println("Read from Excel: " + keyValue + " -> " + equivalentValue + ", " + priceValue);
                }
            }
        }
        return dataMap;
    }

    private static String getCellValueAsString(Cell cell) {
        if (cell == null) {
            return "";
        }
        switch (cell.getCellType()) {
            case STRING:
                return cell.getStringCellValue().trim().replaceAll("\s+", " ");
            case NUMERIC:
                if (DateUtil.isCellDateFormatted(cell)) {
                    return cell.getDateCellValue().toString();
                } else {
                    return String.format("%.2f", cell.getNumericCellValue());
                }
            case BOOLEAN:
                return String.valueOf(cell.getBooleanCellValue());
            case FORMULA:
                try {
                    return String.valueOf(cell.getNumericCellValue());
                } catch (IllegalStateException e) {
                    return cell.getStringCellValue();
                }
            case BLANK:
            default:
                return "";
        }
    }

    private static void logCellType(String columnName, Cell cell) {
        if (cell == null) {
            System.out.println(columnName + " column: EMPTY (null)");
            return;
        }

        switch (cell.getCellType()) {
            case STRING:
                System.out.println(columnName + " column: STRING value -> " + cell.getStringCellValue());
                break;
            case NUMERIC:
                System.out.println(columnName + " column: NUMERIC value -> " + cell.getNumericCellValue());
                break;
            case BOOLEAN:
                System.out.println(columnName + " column: BOOLEAN value -> " + cell.getBooleanCellValue());
                break;
            case FORMULA:
                System.out.println(columnName + " column: FORMULA value -> " + cell.getCellFormula());
                break;
            case BLANK:
                System.out.println(columnName + " column: BLANK");
                break;
            default:
                System.out.println(columnName + " column: UNKNOWN TYPE");
                break;
        }
    }
}

1. The first part of my application, which involves reading and extracting specific patterns from a pdf file, is working correctly. The program successfully identifies and extracts the required sequences of characters that follow the pattern xxxx-xxxx-xxx-xxx (where x represents digits) from the PDF file.

Main.java

import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.List;
import java.util.Map;

public class Main {
    public static void main(String[] args) {
        if (args.length < 3) {
            System.out.println(
                    "Użycie: java -jar my-app.jar   "
            );
            System.exit(1);
        }

        String inputFilePath = args[0];
        String excelFilePath = args[1];
        String outputFilePath = args[2];

        try {
            System.out.println("Rozpoczynam przetwarzanie pliku PDF...");
            List foundPatterns = PDFProcessor.findPatternsInPDF(inputFilePath);

            System.out.println("Odczytuję dane z pliku Excel...");
            Map excelData = ExcelReader.readExcel(excelFilePath);

            System.out.println("Zapisuję wyniki do pliku...");
            FileWriterUtil.writeToFile(outputFilePath, foundPatterns, excelData);

            System.out.println("Proces zakończony pomyślnie.");
        } catch (FileNotFoundException e) {
            System.err.println(e.getMessage());
        } catch (IOException e) {
            System.err.println("Błąd podczas przetwarzania plików: " + e.getMessage());
        }
    }
}

I have two parts of my application that are working correctly:

PDF Processing (PDFProcessor class):

This part of the application successfully reads and extracts specific patterns from the PDF file using the regex pattern \d{4}-\d{4}-\d{3}-\d{3} (where x represents digits).
The extracted values are correctly logged and passed to the next processing stage.

File Writing (FileWriterUtil class):

This part of the application functions as expected.
Before expanding the application, it correctly created the output file and wrote the expected results (e.g., xxxx-xxxx-xxx-xxx, brak, brak), where "brak" indicates that no matching data was found in the Excel file.

[Columns A, B, and C contain sample data:

Column A (A1): 1234-1234-123-123 (text value in a specific format).
Column B (B1): CDMA 1.0 (text value with a number).
Column C (C1): 100,00 zł (numeric value formatted as currency with "zł" symbol).

Columns E, F, and G contain formulas that check the data type of the cells in columns A, B, and C:

E1: ISTEXT(A1) → In cell F1, the result is PRAWDA (TRUE), indicating that A1 is text.
E2: ISTEXT(B1) → In cell F2, the result is PRAWDA (TRUE), indicating that B1 is also text.
E3: ISNUMBER(C1) → In cell F3, the result is PRAWDA (TRUE), confirming that C1 is recognized as a numeric value.

Column G contains the English equivalents of the results, showing TRUE for all checks, confirming that the formulas have returned expected results.]1

"Java Apache POI: Cannot get a STRING value from a NUMERIC cell - how to handle mixed data types?"

Build.gradle

ExcelReader.java

Main.java

Answers (0)

Related Questions

&quot;Java Apache POI: Cannot get a STRING value from a NUMERIC cell - how to handle mixed data types?&quot;

Build.gradle

ExcelReader.java

Main.java

Answers (0)

Related Questions

"Java Apache POI: Cannot get a STRING value from a NUMERIC cell - how to handle mixed data types?"