Srikanth Kandalam

Reputation: 1045

How to sort data in a CSV file using a particular field in Java?

I want to read a CSV file in Java and sort it using a particular column. My CSV file looks like this:

 ABC,DEF,11,GHI....
 JKL,MNO,10,PQR....
 STU,VWX,12,XYZ....

Considering I want to sort it using the third column, my output should look like:

 JKL,MNO,10,PQR....
 ABC,DEF,11,GHI....
 STU,VWX,12,XYZ....

After some research on which data structure to use to hold the CSV data, people here suggested using a Map with Integer keys and List values in this question:

 Map<Integer, List<String>>
 where the value, List<String> = {[ABC,DEF,11,GHI....], [JKL,MNO,10,PQR....],[STU,VWX,12,XYZ....]...}
 And the key will be an auto-incremented integer starting from 0.

So could anyone please suggest a way to sort this Map using an element in the 'List' in Java? Also if you think this choice of data structure is bad, please feel free to suggest an easier data structure to do this.

Thank you.

Upvotes: 4

Views: 26954

Answers (4)

Peter Lawrey

Reputation: 533530

In Java 8 you can do

SortedMap<Integer, List<String>> collect = Files.lines(Paths.get(filename))
    .collect(Collectors.groupingBy(
        l -> Integer.valueOf(l.split(",", 4)[2]),
        TreeMap::new, Collectors.toList()));

Note: comparing numbers as Strings is a bad idea, because "100" sorts before "2", which might not be what you expect.
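To make the pitfall concrete, here is a small sketch that sorts the same values once as Strings and once as parsed integers. The class and method names are made up for illustration, and it assumes every value parses as an int.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class, not part of the original answer.
class StringVsNumericOrder {
    // Sorts a copy of the values lexicographically (as Strings).
    static List<String> sortAsStrings(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }

    // Sorts a copy of the values by their parsed integer value.
    // Assumes every value is a valid int; otherwise NumberFormatException is thrown.
    static List<String> sortAsNumbers(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        copy.sort(Comparator.comparingInt((String s) -> Integer.parseInt(s)));
        return copy;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("100", "2", "11");
        System.out.println(sortAsStrings(values)); // [100, 11, 2]
        System.out.println(sortAsNumbers(values)); // [2, 11, 100]
    }
}
```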

I would use a sorted multi-map. If you don't have one handy you can do this.

SortedMap<Integer, List<String>> linesByKey = new TreeMap<>();

public void addLine(String line) {
    // The third field (index 2) holds the number to sort by
    Integer key = Integer.valueOf(line.split(",", 4)[2]);
    List<String> lines = linesByKey.get(key);
    if (lines == null)
         linesByKey.put(key, lines = new ArrayList<>());
    lines.add(line);
}

This produces a collection of lines sorted by the number, where lines with duplicate numbers keep their original order; e.g. if all the lines have the same number, the order is unchanged.
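To get back a single sorted list of lines, the grouped map can be walked in key order. The following is a minimal sketch of that idea; the class name and the group and flatten helpers are hypothetical, and it assumes the third field of every line is a valid integer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical helper class, not part of the original answer.
class GroupedCsvSort {
    // Groups lines by the integer in the third column; the TreeMap keeps keys sorted.
    static SortedMap<Integer, List<String>> group(List<String> lines) {
        SortedMap<Integer, List<String>> linesByKey = new TreeMap<>();
        for (String line : lines) {
            Integer key = Integer.valueOf(line.split(",", 4)[2].trim());
            // Lines with the same key are appended in the order they were read.
            linesByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(line);
        }
        return linesByKey;
    }

    // Walks the groups in ascending key order and concatenates them into one list.
    static List<String> flatten(SortedMap<Integer, List<String>> linesByKey) {
        List<String> sorted = new ArrayList<>();
        for (List<String> sameKey : linesByKey.values()) {
            sorted.addAll(sameKey);
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "ABC,DEF,11,GHI", "JKL,MNO,10,PQR", "STU,VWX,12,XYZ", "AAA,BBB,11,CCC");
        for (String line : flatten(group(lines))) {
            System.out.println(line);
        }
    }
}
```

In this sketch the two lines with key 11 come out in the order they were read, which matches the stability described above.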

Upvotes: 4

Anuj

Reputation: 1

In the code below, I have sorted the CSV file based on the second column (compared as Strings).


public static void main(String[] args) throws IOException {
    String csvFile = "file_1.csv";
    String line = "";
    String cvsSplitBy = ",";
    List<List<String>> llp = new ArrayList<>();
    try (BufferedReader br = new BufferedReader(new FileReader(csvFile))) {
        while ((line = br.readLine()) != null) {
            llp.add(Arrays.asList(line.split(cvsSplitBy)));
        }
        llp.sort(new Comparator<List<String>>() {
            @Override
            public int compare(List<String> o1, List<String> o2) {
                return o1.get(1).compareTo(o2.get(1));
            }
        });
        System.out.println(llp);

    } catch (IOException e) {
        e.printStackTrace();
    }
}
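If the sorted rows should end up in a file rather than on the console, the same idea can be sketched with java.nio.file.Files. This is only an illustration: the class and method names are hypothetical, the temp files stand in for real paths, and it assumes plain comma-separated fields with no quoted commas.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper class, not part of the original answer.
class CsvFileSorter {
    // Returns a sorted copy of the lines, ordered by the text in the given column.
    // The comparison is lexicographic (String order), as in the answer above.
    static List<String> sortByColumn(List<String> lines, int column) {
        List<String> sorted = new ArrayList<>(lines);
        sorted.sort(Comparator.comparing((String line) -> line.split(",")[column]));
        return sorted;
    }

    // Reads all lines from 'in', sorts them, and writes the result to 'out'.
    static void sortFile(Path in, Path out, int column) throws IOException {
        Files.write(out, sortByColumn(Files.readAllLines(in), column));
    }

    public static void main(String[] args) throws IOException {
        // Temp files are used here only so the sketch runs anywhere.
        Path in = Files.createTempFile("unsorted", ".csv");
        Path out = Files.createTempFile("sorted", ".csv");
        Files.write(in, Arrays.asList("ABC,DEF,11", "JKL,ABC,10", "STU,CCC,12"));
        sortFile(in, out, 1); // sort by the second column
        System.out.println(Files.readAllLines(out));
    }
}
```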

Upvotes: 0

AlexWien

Reputation: 28727

I would use an ArrayList of ArrayList of String:

ArrayList<ArrayList<String>>

Each entry is one line, which is a list of strings. You initialize the list by:

List<ArrayList<String>> csvLines = new ArrayList<ArrayList<String>>();

To get the nth line:

List<String> line = csvLines.get(n);

To sort, write a custom Comparator. You can pass the field position used for sorting to the comparator's constructor.

The compare method then gets the String value at the stored position and converts it to a primitive Java type depending on the position. E.g. if you know that position 2 in the CSV holds an integer, convert the String to an int. This is necessary for correct sorting. You may also pass an ArrayList of Class to the constructor so that it knows which field has which type.
Then use String.compareTo() or Integer.compare(), depending on the column position.

Edit: example of working code:

List<ArrayList<String>> csvLines = new ArrayList<ArrayList<String>>();
Comparator<ArrayList<String>> comp = new Comparator<ArrayList<String>>() {
    public int compare(ArrayList<String> csvLine1, ArrayList<String> csvLine2) {
        // TODO here convert to Integer depending on field.
        // example is for numeric field 2
        return Integer.valueOf(csvLine1.get(2)).compareTo(Integer.valueOf(csvLine2.get(2)));
    }
};
Collections.sort(csvLines, comp);
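The variant described earlier, where the field position is passed to the comparator's constructor, could look roughly like this. It is a sketch under assumptions: the class name CsvFieldComparator and the boolean numeric flag are invented here as a simpler stand-in for the list of Class objects, and numeric fields are assumed to fit in an int.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical comparator class, not part of the original answer.
class CsvFieldComparator implements Comparator<List<String>> {
    private final int position;    // zero-based index of the field to sort by
    private final boolean numeric; // true if the field should be compared as an int

    CsvFieldComparator(int position, boolean numeric) {
        this.position = position;
        this.numeric = numeric;
    }

    @Override
    public int compare(List<String> line1, List<String> line2) {
        String left = line1.get(position);
        String right = line2.get(position);
        if (numeric) {
            // Parse before comparing so that 10 sorts before 100, not after 2.
            return Integer.compare(Integer.parseInt(left.trim()),
                                   Integer.parseInt(right.trim()));
        }
        return left.compareTo(right);
    }

    public static void main(String[] args) {
        List<List<String>> csvLines = new ArrayList<>();
        csvLines.add(Arrays.asList("ABC", "DEF", "11", "GHI"));
        csvLines.add(Arrays.asList("JKL", "MNO", "10", "PQR"));
        csvLines.add(Arrays.asList("STU", "VWX", "12", "XYZ"));
        csvLines.sort(new CsvFieldComparator(2, true));
        System.out.println(csvLines);
    }
}
```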

Upvotes: 4

ltalhouarne

Reputation: 4636

You can also use a list of lists:

List<List<String>> Llp = new ArrayList<List<String>>();

Then you need to call sort with a custom comparator that compares the third item in each list:

    Collections.sort(Llp, new Comparator<List<String>>() {
        @Override
        public int compare(List<String> o1, List<String> o2) {
            try {
                // Compares the third field as Strings
                return o1.get(2).compareTo(o2.get(2));
            } catch (IndexOutOfBoundsException e) {
                // Rows with fewer than three fields are treated as equal
                return 0;
            }
        }
    });

Upvotes: 0
