Reputation: 261
I have a question for Apache common CSVParser/CSVRecord. Take a look at the CSV file below:
Header1,Header2,Header3
"",,"L1C3"
CSVParser/CSVRecord is returning "" for the first two columns. In my case I want to distinguish empty string("") and null values. Is there a configuration I could set to let CSVParser to return null for the second column?
Upvotes: 7
Views: 13758
Reputation: 2896
I've used this format :
CSVFormat.RFC4180.withFirstRecordAsHeader()
.withIgnoreSurroundingSpaces()
.withNullString("")
Where the 2 configurations:
Here's a sample usage:
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNull;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import org.junit.Test;
public class CsvParseTest {
@Test
public void testParseWillTrimAndConvertToNull() throws Exception {
String CSV_HEADER = "Name,MobileNo,Location";
String CSV_ROW_1 = "abc, ,australia"; // MobileNo is 3 whitespaces
CSVParser parse = CSVFormat.RFC4180.withFirstRecordAsHeader().withIgnoreSurroundingSpaces().withNullString("")
.parse(new BufferedReader(new StringReader(CSV_HEADER + "\n" + CSV_ROW_1)));
CsvRecord rec = parse.getRecords().get(0);
assertEquals("abc", rec.get("Name"));
assertNull(rec.get("MobileNo"));
assertEquals("australia", rec.get("Location"));
}
}
Upvotes: 9
Reputation: 4485
In Apache commons csv 1.2, we could use the CSVFormat
class method withNullString()
to convert null strings to NULL
. Here null string could be ""
or "N/A"
or "Nill"
according to your requirement.
CSVFormat csvFormat = CSVFormat.DEFAULT.withNullString("");
CSVParser csvParser = new CSVParser(fileReader, csvFormat);
This would give NULL, NULL, L1C3
for the given record in question.
Note: Empty records are automatically converted to empty strings thus resulting in converted to NULL
value at last.
Upvotes: 0
Reputation: 261
At the end I didn't find a good solution to return null with Apache Commons CSV library. I switched to OpenCSV 3.6 and here is the code I used, which I also posted on another thread. Thanks to everyone else that suggested OpenCSV.
CSVReaderBuilder has withFieldAsNull() for this purpose.
CSVReader csvReader = new CSVReaderBuilder(csvFileReader)
.withFieldAsNull(CSVReaderNullFieldIndicator.EMPTY_SEPARATORS)
.build();
Upvotes: 1
Reputation: 6289
I think uniVocity-parsers is the only library that allows you to distinguish empty strings from nulls (I know this won't address your problem with Apache Commons CSV directly, but at least there's a way to get what you need).
Here's how to do it:
public static void main(String ... args){
String input = "Header1,Header2,Header3\n" +
"\"\",,\"L1C3\"";
CsvParserSettings settings = new CsvParserSettings(); //many options here, check the tutorial.
settings.setEmptyValue("I'm empty"); //value to use when the parser finds "". Set to "" to get an empty String.
settings.setNullValue("I'm null"); //value to use when the parser finds a null value (i.e. ,,).
CsvParser parser = new CsvParser(settings);
List<String[]> allRows = parser.parseAll(new StringReader(input));
for(String[] row : allRows){
System.out.println(Arrays.toString(row));
}
}
This will produce the following output:
[Header1, Header2, Header3]
[I'm empty, I'm null, L1C3]
uniVocity-parsers is also 3 times faster than Apache Commons CSV and has way more features.
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).
Upvotes: 2