USB

Reputation: 6139

How to find min and max value for each column in the entire csv file

How can I find the min and max value for each column (except alphabetic values) in a CSV file?

I want to get each column's min and max values. Given this input:

5.3,3.6,1.6,0.3,Iris-setosa
4.9,3.3,1.6,0.3,Iris-setosa
4.9,3.3,1.3,0.3,Iris-setosa
4.6,3.3,1.6,0.0,Iris-setosa

col 1, min = 4.6 ,max = 5.3
col 2, min = 3.3 ,max = 3.6
col 3, min = 1.3 ,max = 1.6
col 4, min = 0.0 ,max = 0.3

What I did was iterate through each line and store each column in a hashmap:

{1=[5.3,4.9,4.9,4.6],2=[3.6,3.3,3.3,3.3],3=[1.6,1.6,1.3,1.6],4=[0.3,0.3,0.3,0.0]}

Then I calculated

for (Map.Entry<String, List<String>> entry : map.entrySet()) {      
// Iterating through values
String key = entry.getKey();
List<String> values = entry.getValue();
min = Double.parseDouble(Collections.min(values));
max = Double.parseDouble(Collections.max(values));
}

But with large input it is not a good idea to hold that much data in a hashmap and only then find the min and max. How can I find the min/max some other way?

Update

String line[] = value.split(delimit);
for(int i=0;i<line.length -1;i++){
 if (Double.parseDouble(line[i] ) < min) { 
   min = Double.parseDouble(line[i] );
  }
 if (Double.parseDouble(line[i] ) > max) {
  max = Double.parseDouble(line[i] );
  }
}

Not getting the expected result.

Solution: Calculating min and max of columns in a csv file

Upvotes: 0

Views: 7755

Answers (4)

Salah

Reputation: 8657

You could do this:

  • Read the file using a Stream.
  • Read data line by line.
  • Split the columns.
  • Create a method to calculate the max and the min.

So it could look like this:

    BufferedReader br = null;
    String line = "";
    String cvsSplitBy = ",";

    try {

        br = new BufferedReader(new FileReader(csvFile));
        while ((line = br.readLine()) != null) {

            // use comma as separator
            String[] columns= line.split(cvsSplitBy);

            calculateMinAndMax(columns);

        }

    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        if (br != null) {
            try {
                br.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

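Then create a method to calculate min and max:

// Start from the extremes so the first value read always replaces them
// (starting from 0 would never record a positive min or a negative max).
private double[] maxValues = {Double.NEGATIVE_INFINITY, Double.NEGATIVE_INFINITY, Double.NEGATIVE_INFINITY, Double.NEGATIVE_INFINITY};
private double[] minValues = {Double.POSITIVE_INFINITY, Double.POSITIVE_INFINITY, Double.POSITIVE_INFINITY, Double.POSITIVE_INFINITY};
private void calculateMinAndMax(String[] line) {
    // Only the first four columns are numeric; the trailing label column is skipped
    for (int i = 0; i < maxValues.length; i++) {
            //check the max value
            double currentValue = Double.parseDouble(line[i]);
            if(currentValue > maxValues[i] ) {
                maxValues[i] = currentValue;
            }

            //check the min value
            if(currentValue < minValues[i]) {
                minValues[i] = currentValue;
            }
    }
}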
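As a variation on the same idea, here is a minimal sketch that lets `java.util.DoubleSummaryStatistics` keep the running min and max for each column instead of hand-written comparisons. The class name `ColumnStats`, the method `summarize`, and the `numericColumns` parameter are made up for this illustration, and it assumes every non-blank line has at least that many numeric cells before the label.

```java
import java.util.DoubleSummaryStatistics;
import java.util.stream.Stream;

class ColumnStats {

    // Hypothetical helper: one statistics object per numeric column.
    // Assumes each non-blank line has at least numericColumns numeric cells.
    static DoubleSummaryStatistics[] summarize(Stream<String> lines, int numericColumns) {
        DoubleSummaryStatistics[] stats = new DoubleSummaryStatistics[numericColumns];
        for (int i = 0; i < numericColumns; i++) {
            stats[i] = new DoubleSummaryStatistics();
        }
        lines.filter(line -> !line.trim().isEmpty()).forEach(line -> {
            String[] columns = line.split(",");
            for (int i = 0; i < numericColumns; i++) {
                // accept() updates the running min, max, count and sum
                stats[i].accept(Double.parseDouble(columns[i].trim()));
            }
        });
        return stats;
    }

    public static void main(String[] args) {
        DoubleSummaryStatistics[] stats = summarize(
                Stream.of("5.3,3.6,1.6,0.3,Iris-setosa", "4.6,3.3,1.3,0.0,Iris-setosa"), 4);
        for (int i = 0; i < stats.length; i++) {
            System.out.println("col " + (i + 1) + ", min = " + stats[i].getMin()
                    + ", max = " + stats[i].getMax());
        }
    }
}
```

For a real file, the stream could come from `Files.lines(path)` opened in a try-with-resources block, so only one line is held in memory at a time.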

Upvotes: 1

Lesto

Reputation: 2299

Why create an array/list/set when you can find the max/min for every cell WHILE READING the line?

  1. read a line
  2. split it
  3. convert cells to double and check for min/max
  4. next line

With only one loop you have your result. You can also store the values in an array/list/set for further processing, but that is not necessary (and it is slow, since the array/list/set will probably have to be resized many times if the file size is not known in advance; RAM usage will also be much bigger: all the data versus just one min/max variable for each cell).
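The four steps above could be sketched roughly like this. The class name `ColumnMinMax`, the `scan` method, and the `numericColumns` parameter are invented for the example, and it assumes the numeric cells come first on every line, followed by the label.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;

class ColumnMinMax {
    final double[] min;
    final double[] max;

    ColumnMinMax(int numericColumns) {
        min = new double[numericColumns];
        max = new double[numericColumns];
        // Start at the extremes so the first value read always wins
        Arrays.fill(min, Double.POSITIVE_INFINITY);
        Arrays.fill(max, Double.NEGATIVE_INFINITY);
    }

    // Assumption: the first numericColumns cells of every non-blank line are numeric.
    static ColumnMinMax scan(Reader source, int numericColumns) {
        ColumnMinMax result = new ColumnMinMax(numericColumns);
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {        // 1. read a line
                if (line.trim().isEmpty()) {
                    continue;
                }
                String[] cells = line.split(",");               // 2. split it
                for (int i = 0; i < numericColumns; i++) {      // 3. convert and compare
                    double value = Double.parseDouble(cells[i].trim());
                    result.min[i] = Math.min(result.min[i], value);
                    result.max[i] = Math.max(result.max[i], value);
                }
            }                                                   // 4. next line
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "5.3,3.6,1.6,0.3,Iris-setosa\n"
                + "4.9,3.3,1.6,0.3,Iris-setosa\n"
                + "4.9,3.3,1.3,0.3,Iris-setosa\n"
                + "4.6,3.3,1.6,0.0,Iris-setosa\n";
        ColumnMinMax r = scan(new StringReader(sample), 4);
        for (int i = 0; i < 4; i++) {
            System.out.println("col " + (i + 1) + ", min = " + r.min[i] + ", max = " + r.max[i]);
        }
    }
}
```

Memory use stays at two doubles per column no matter how many lines the file has. For a file on disk, pass a `FileReader` (or `Files.newBufferedReader`) instead of the `StringReader`.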

Upvotes: 1

If you care about large data sets, you should process the data inline as much as possible.

In your case the source is divided into two kinds of items: lines, and the elements within each line. You can use the Scanner class for both:

    Scanner lineScanner = new Scanner(source);

    while (lineScanner.hasNextLine()) {

        Scanner elementScanner = new Scanner(lineScanner.nextLine()).useDelimiter(",");

        // Stops at the first non-numeric cell, e.g. the label column
        for (int column = 1; elementScanner.hasNextDouble(); column++) {

            double nextDouble = elementScanner.nextDouble();

            updateMax(column, nextDouble); //or updateMinMax(column,nextDouble);
            updateMin(column, nextDouble);

        }

        elementScanner.close();

    }

    lineScanner.close();
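The `updateMin`/`updateMax` calls in the loop above are left undefined. One possible shape for them, merged into a single `updateMinMax`, is sketched below. The class name `ScannerMinMax`, the `ranges` map, and the choice of `Locale.US` (so that `.` is always read as the decimal separator) are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Scanner;

class ScannerMinMax {

    // column number (1-based, as in the loop above) -> {min, max}
    final Map<Integer, double[]> ranges = new HashMap<>();

    // Hypothetical helper: keeps only two doubles per column
    void updateMinMax(int column, double value) {
        double[] range = ranges.computeIfAbsent(column, c -> new double[] {value, value});
        range[0] = Math.min(range[0], value);
        range[1] = Math.max(range[1], value);
    }

    void scan(String source) {
        try (Scanner lineScanner = new Scanner(source)) {
            while (lineScanner.hasNextLine()) {
                try (Scanner elementScanner = new Scanner(lineScanner.nextLine())) {
                    // Locale.US so "5.3" parses regardless of the default locale
                    elementScanner.useDelimiter(",").useLocale(Locale.US);
                    for (int column = 1; elementScanner.hasNextDouble(); column++) {
                        updateMinMax(column, elementScanner.nextDouble());
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        ScannerMinMax stats = new ScannerMinMax();
        stats.scan("5.3,3.6,1.6,0.3,Iris-setosa\n4.6,3.3,1.3,0.0,Iris-setosa");
        stats.ranges.forEach((column, range) ->
                System.out.println("col " + column + ", min = " + range[0] + ", max = " + range[1]));
    }
}
```

Using a map means the number of columns does not have to be known up front; a plain array would work just as well if it is.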

Upvotes: 1

TheLostMind

Reputation: 36304

  1. Split() each line on ",".
  2. From the array returned by split(), ignore/delete the last cell/index.
  3. Sort the array.
  4. In the sorted array, get the min and max values.

Repeat steps 1-4 in a loop while there are more lines in the file. Happy coding.

Upvotes: 1
