Reputation: 411
I'm trying to read records from both CSV and Excel files using the Apache Commons CSV CSVParser. My approach is below.
try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(file.getInputStream(), StandardCharsets.UTF_8))) {
    CSVParser parser;
    String extension = StringUtils.getFilenameExtension(file.getOriginalFilename());
    logger.info("Extension Detected : {}", extension);
    if (extension != null && extension.equals("xlsx")) {
        // Excel branch: this is the case that fails
        parser = new CSVParser(reader,
                CSVFormat.EXCEL.withFirstRecordAsHeader().withTrim());
    } else if (extension != null && extension.equals("csv")) {
        parser = new CSVParser(reader, CSVFormat.DEFAULT.withFirstRecordAsHeader()
                .withIgnoreHeaderCase().withTrim());
    } else {
        throw new InvalidRequestException(Collections.singletonList(INVALID_EXTENSION));
    }
    logger.info("Records parsed successfully for file : {}", file.getOriginalFilename());
    return parser.getRecords();
} catch (IOException e) {
    throw new FileReadException(e);
}
This works for CSV files but not for Excel files: the parser does not read the records properly. I inspected the parser in the debugger.
I suspect it is related to reusableToken, because for a CSV file the token type is TOKEN, but for an Excel file it is EORECORD.
Can anybody please help?
Upvotes: 1
Views: 2874
Reputation: 1217
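Apache Commons CSV's CSVParser can't parse Excel workbooks (.xls or .xlsx), only delimited text files such as .csv. An .xlsx file is a ZIP archive of XML parts, not text, so reading it through a Reader produces garbage tokens.
Using the CSVFormat.EXCEL parameter just tells CSVParser that the CSV file was created/exported using Excel, so it should expect Excel's CSV dialect.
To read .xls and .xlsx files you need to use Apache POI.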
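Since the extension check in the question trusts the file name, here is a minimal sketch of checking the content instead, using only the JDK. It relies on the fact that an .xlsx file is a ZIP container and therefore starts with the bytes "PK\003\004". The class name UploadSniffer and its methods are hypothetical, invented for this illustration, and the in-memory ZIP only imitates the container of a real workbook.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper: tells ZIP-based uploads (such as .xlsx) apart from plain text. */
class UploadSniffer {

    /**
     * Returns true if the stream starts with the ZIP local-file-header
     * signature 'P' 'K' 0x03 0x04. The stream is reset afterwards, so it
     * must support mark/reset (wrap it in a BufferedInputStream if needed).
     */
    static boolean looksLikeZip(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(4);
        byte[] head = new byte[4];
        int n = 0;
        while (n < 4) {
            int r = in.read(head, n, 4 - n);
            if (r < 0) {
                break; // fewer than 4 bytes available
            }
            n += r;
        }
        in.reset();
        return n == 4 && head[0] == 'P' && head[1] == 'K' && head[2] == 3 && head[3] == 4;
    }

    /** Builds a tiny ZIP in memory; it mimics only the container of an .xlsx, not a valid workbook. */
    static byte[] tinyZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("xl/workbook.xml"));
            zip.write("<workbook/>".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] csv = "id,name\n1,Ann\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(looksLikeZip(new ByteArrayInputStream(tinyZip()))); // true
        System.out.println(looksLikeZip(new ByteArrayInputStream(csv)));       // false
    }
}
```

With a check like this, a ZIP-based upload could be handed to Apache POI (for example WorkbookFactory.create(InputStream)) and everything else to CSVParser. Legacy .xls files use a different binary container, so this particular check would not detect them.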
Upvotes: 3