Joe

Reputation: 11

Spark Java - Convert json inside csv to map

I have a CSV file with ~30 columns, one of which is a JSON string. I want to read the CSV and break the JSON down into rows (explode), one row per key/value pair.

For example, given this CSV:

"data1,date1,{"USERS-1":"ff", "name1":"Joe1", "age":"1"},1" 
"data2,date2,{"USERS-2":"ff", "name2":"Joe2", "age":"2"},2" 
"data3,date3,{"USERS-3":"ff", "name3":"Joe3", "age":"3"},3" 

Result after:

"data1,date1,"USERS-1","ff",1"
"data1,date1,"name1","Joe1",1"
"data1,date1,"age","1",1"
"data2,date2,"USERS-2","ff",2"
"data2,date2,"name2","Joe2",2"
"data2,date2,"age","2",2"
"data3,date3,"USERS-3","ff",3"
"data3,date3,"name3","Joe3",3"
"data3,date3,"age","3",3"

I'm writing in Java, not Scala.

The JSON is unstructured!

Upvotes: 1

Views: 120

Answers (1)

Bîrsan Octav

Reputation: 69

Joe! I wrote a class to show you how I would approach your problem. After the code I give some extra details so you can better understand what it does.

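import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MMM {

public static void main(String[] args) {
    String s = "data1,date1,{\"USERS-1\":\"ff\", \"name1\":\"Joe1\", \"age\":\"1\"},1";
    processLine(s);
}

public static void processLine(String s) {
    // Split once at the first '{': leading columns, then the JSON body plus the trailing column.
    final String[] parts = s.split("[{]", 2);
    final String dates = parts[0];
    final String content = parts[1];
    // Split on ',' and '}', trim each piece and drop the empty ones.
    // The last element is the trailing column; the others are "key":"value" pairs.
    final List<String> elements = Arrays.stream(content.split("[,}]")).map(String::trim).filter(x -> !x.isEmpty())
            .collect(Collectors.toList());
    final String last = elements.get(elements.size() - 1);
    for (int i = 0; i < elements.size() - 1; i++) {
        // Turn "key":"value" into "key","value" so each pair becomes two CSV fields.
        final String pair = elements.get(i).replace("\":\"", "\",\"");
        System.out.println(dates + pair + "," + last);
    }
}
}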

Basically, the code splits a line read from the CSV into two parts: the leading columns (everything before the opening brace) and everything after it. The second part is split again on commas and the closing brace, each piece is trimmed to remove the spaces at its ends, and the empty strings are filtered out. That leaves a list of the key/value pairs followed by the trailing column. To make it easier to see what the method does, I print the result; you can easily modify the code to return the rows in a list or whatever you like. Note that this simple splitting assumes the keys and values themselves contain no commas or braces. I hope my answer was helpful, have a nice day!
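If you would rather get the rows back as a list than print them, here is a minimal sketch of that variant. The class name JsonColumnExploder and the method explode are names I made up for illustration. It assumes the JSON is a flat object whose keys and values are all plain strings with no escaped quotes inside them, and that the line contains exactly one JSON column. It matches the pairs with a regular expression instead of splitting on commas, so a comma inside a value does not break it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JsonColumnExploder {

    // Matches one "key":"value" pair; assumes string values without escaped quotes.
    private static final Pattern PAIR =
            Pattern.compile("\"([^\"]*)\"\\s*:\\s*\"([^\"]*)\"");

    // Returns one output row per key/value pair found in the JSON column.
    public static List<String> explode(String line) {
        List<String> rows = new ArrayList<>();
        int open = line.indexOf('{');
        int close = line.lastIndexOf('}');
        if (open < 0 || close < open) {
            // No JSON object on this line: pass it through unchanged.
            rows.add(line);
            return rows;
        }
        String prefix = line.substring(0, open);       // e.g. "data1,date1,"
        String json = line.substring(open + 1, close); // text between the braces
        String suffix = line.substring(close + 1);     // e.g. ",1"
        Matcher m = PAIR.matcher(json);
        while (m.find()) {
            rows.add(prefix + "\"" + m.group(1) + "\",\"" + m.group(2) + "\"" + suffix);
        }
        return rows;
    }

    public static void main(String[] args) {
        String line = "data1,date1,{\"USERS-1\":\"ff\", \"name1\":\"Joe1\", \"age\":\"1\"},1";
        explode(line).forEach(System.out::println);
    }
}
```

In Spark you could probably call this from a flatMap over the lines of the file, for example lines.flatMap(l -> JsonColumnExploder.explode(l).iterator()) on a JavaRDD<String>, but I have not run that part. For nested JSON or non-string values a real JSON parser would be the safer choice.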

Upvotes: 2
