Reputation: 2115
After some quick Googling, I found an easy way to read and parse a CSV file to JSON using the Jackson library. All well and good, except ... some of the CSV header column names have embedded newlines. The program handles it, but I'm left with JSON keys with newlines embedded within. I'd like to remove these (or replace them with a space).
Here is the simple program I found:
import java.io.File;
import java.util.List;
import java.util.Map;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

public class CSVToJSON {
    public static void main(String[] args) throws Exception {
        File input = new File("PDM_BOM.csv");
        File output = new File("output.json");
        CsvSchema csvSchema = CsvSchema.builder().setUseHeader(true).build();
        CsvMapper csvMapper = new CsvMapper();
        // Read data from CSV file
        List<Object> readAll = csvMapper.readerFor(Map.class).with(csvSchema).readValues(input)
                .readAll();
        ObjectMapper mapper = new ObjectMapper();
        // Write JSON formatted data to output.json file
        mapper.writerWithDefaultPrettyPrinter().writeValue(output, readAll);
        // Write JSON formatted data to stdout
        System.out.println(mapper.writerWithDefaultPrettyPrinter().writeValueAsString(readAll));
    }
}
So, as an example, given a column header like this:
PARENT\nITEM\nNUMBER
Here's an example of what is produced:
"PARENT\nITEM\nNUMBER" : "208E8840040",
I need this to be:
"PARENT ITEM NUMBER" : "208E8840040",
Is there a configuration setting on the Jackson mapper that can handle this? Or, do I need to provide some sort of custom "handler" to the mapper?
Special cases
To add some complexity, there are cases where just replacing the newline with a space will not always yield what is needed.
Example 1:
Sometimes there is a column header like this:
QTY\nORDER/\nTRANSACTION
In this case, I need the newline after the slash removed and replaced with nothing (the first newline still becomes a space), so that the result is:
QTY ORDER/TRANSACTION
, not
QTY ORDER/ TRANSACTION
Example 2:
Sometimes, for whatever reason, a column header has a space before the newline:
EFFECTIVE \nTHRU DATE
This needs to come out as:
EFFECTIVE THRU DATE
, not
EFFECTIVE  THRU DATE
Any ideas on how to handle at least the main issue would be very much appreciated.
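For reference, the rules from the special cases above can be written down as a small helper. This is only a sketch of the desired behavior, not something Jackson provides: `HeaderNames` and `normalize` are made-up names, and it assumes that a space directly after a slash is never wanted in a header.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
final class HeaderNames {
    // Any run of spaces, tabs, and line breaks.
    private static final Pattern WHITESPACE_RUN = Pattern.compile("\\s+");

    static String normalize(String header) {
        // Collapse each whitespace run (including " \n") into a single space.
        String collapsed = WHITESPACE_RUN.matcher(header.trim()).replaceAll(" ");
        // A line break that followed a slash has become "/ "; close that gap.
        // Assumption: no header legitimately contains a slash followed by a space.
        return collapsed.replace("/ ", "/");
    }
}
```

With this, `HeaderNames.normalize("QTY\nORDER/\nTRANSACTION")` gives `QTY ORDER/TRANSACTION`, and `HeaderNames.normalize("EFFECTIVE \nTHRU DATE")` gives `EFFECTIVE THRU DATE` with a single space.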
Upvotes: 0
Views: 2201
Reputation: 2115
OK, came up with a solution. It's ugly, but it works. Basically, after the CsvMapper finishes, I go through the giant ugly collection that it produces, run String.replaceAll on each key (thanks to https://stackoverflow.com/users/4402505/prem-kurian-philip for that suggestion) to remove the unwanted characters, and then rebuild the map.
In any case here's the new code:
import java.io.File;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

public class CSVToJSON {
    @SuppressWarnings("unchecked")
    public static void main(String[] args) throws Exception {
        File input = new File("PDM_BOM.csv");
        File output = new File("output.json");
        CsvSchema csvSchema = CsvSchema.builder().setUseHeader(true).build();
        CsvMapper csvMapper = new CsvMapper();
        // Read data from CSV file
        List<Object> readData = csvMapper.readerFor(Map.class).with(csvSchema).readValues(input)
                .readAll();
        for (Object mapObj : readData) {
            LinkedHashMap<String, String> map = (LinkedHashMap<String, String>) mapObj;
            List<String> deleteList = new ArrayList<>();
            LinkedHashMap<String, String> insertMap = new LinkedHashMap<>();
            for (Entry<String, String> entry : map.entrySet()) {
                String oldKey = entry.getKey();
                // Collapse each run of newlines and other whitespace into one space
                String newKey = oldKey.replaceAll("[\\n\\s]+", " ");
                String value = entry.getValue();
                deleteList.add(oldKey);
                insertMap.put(newKey, value);
            }
            // Delete the old ...
            for (String oldKey : deleteList) {
                map.remove(oldKey);
            }
            // and bring in the new
            map.putAll(insertMap);
        }
        ObjectMapper mapper = new ObjectMapper();
        // Write JSON formatted data to output.json file
        mapper.writerWithDefaultPrettyPrinter().writeValue(output, readData);
        // Write JSON formatted data to stdout
        System.out.println(mapper.writerWithDefaultPrettyPrinter().writeValueAsString(readData));
    }
}
It seems like there should be a better way to achieve this.
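One way to tidy up the loop, as a rough sketch rather than a Jackson feature: build a fresh LinkedHashMap per row instead of deleting and re-inserting keys in place. `KeyRenamer` and `renameKeys` are made-up names, the sketch assumes every row is a `Map<String, String>`, and how the rows come out of the CsvMapper stays the same as in the code above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical helper; the class and method names are illustrative only.
final class KeyRenamer {
    // Returns new rows whose keys have been passed through `rename`.
    // Column order is preserved; the input rows are left untouched.
    static List<Map<String, String>> renameKeys(List<Map<String, String>> rows,
                                                UnaryOperator<String> rename) {
        List<Map<String, String>> result = new ArrayList<>(rows.size());
        for (Map<String, String> row : rows) {
            Map<String, String> renamed = new LinkedHashMap<>();
            for (Map.Entry<String, String> entry : row.entrySet()) {
                // If two old keys map to the same new key, the later value wins.
                renamed.put(rename.apply(entry.getKey()), entry.getValue());
            }
            result.add(renamed);
        }
        return result;
    }
}
```

A call could look like `KeyRenamer.renameKeys(rows, key -> key.replaceAll("\\s+", " "))`, and the returned list can be handed to the ObjectMapper in place of the original one.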
Upvotes: 0
Reputation: 306
You can use the String replaceAll() method to replace all new lines with spaces.
String str = mapper.writerWithDefaultPrettyPrinter().writeValueAsString(readAll);
str = str.trim().replaceAll("[\\n\\s]+", " ");
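One thing worth checking before relying on this: in the serialized JSON, a newline inside a key appears as the two characters `\` and `n` (as in the question's output), while the real line breaks in the string are the pretty-printer's layout. The sketch below uses a hand-written stand-in for the output, not actual Jackson output, and `SerializedJsonDemo` and its members are made-up names.

```java
// Hypothetical demo class; SAMPLE is a hand-written stand-in for pretty-printed JSON.
final class SerializedJsonDemo {
    // In the JSON text the key holds backslash + 'n', not a real line break.
    static final String SAMPLE = "{\n  \"PARENT\\nITEM\\nNUMBER\" : \"208E8840040\"\n}";

    // Whitespace regex: flattens the layout but leaves the escaped \n in the key.
    static String collapseWhitespace(String json) {
        return json.trim().replaceAll("\\s+", " ");
    }

    // Literal replacement of backslash + 'n': fixes the key, but would also
    // change any value that contains an escaped newline.
    static String replaceEscapedNewlines(String json) {
        return json.replace("\\n", " ");
    }
}
```

If only the keys should change, renaming them before serialization avoids touching the values.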
Upvotes: 1