Java compact string equalsIgnoreCase with non-compact string evaluates to false

Question

When parsing a .csv file I iterate over the file's column headers and see if one of them equals (ignoring case) comparand id:

String comparand = "id";
for (String header : headerMap.keySet()) {
   if (header.equalsIgnoreCase(comparand)) {
      recordMap.put("_id", csvRecord.get(header));
   } else {
      recordMap.put(header, csvRecord.get(header));
   }
}

The file is read using the UTF-8 charset:

Reader reader = new InputStreamReader(file.getInputStream(), StandardCharsets.UTF_8);

The CSV parser library I use is Apache Commons CSV:

CSVParser csvParser = CSVFormat.DEFAULT
   .withDelimiter(delimiter)
   .withFirstRecordAsHeader()
   .withIgnoreEmptyLines()
   .parse(reader);

Map headerMap = csvParser.getHeaderMap();

Somehow the above equalsIgnoreCase() evaluates to false while both strings have the value id.

Observing the debugger shows that the header value is a non-compact string (UTF-16) whereas the comparand value is a compact string (ASCII):

Is this default behavior or a bug? How can I make the equalsIgnoreCase evaluate to true as one would expect?

Karol Dowbecki · Accepted Answer

Your header value starts with an UTF-16 BOM FFFE. Strip the BOM when reading the header before comparing it with comparand.

Java compact string equalsIgnoreCase with non-compact string evaluates to false

Answers (1)

Related Questions