Reputation: 465
On Linux, I am trying to sort my data by the 1st and 2nd columns and print only the line with the highest value in the 3rd column. My data looks like this:
A 1 75.0
A 1 99.0
A 2 68.0
B 1 66.0
B 1 50.0
B 2 75.0
B 2 80.0
The "keys" are in columns 1 and 2. When several lines have the same values in columns 1 and 2, I want to print only the line with the highest value in the 3rd column, like this:
A 1 99.0
A 2 68.0
B 1 66.0
B 2 80.0
I tried to do it with sort -k1,1 -k2,2, but how can I change the command so that it prints only the line with the highest value in column 3?
Upvotes: 1
Views: 563
Reputation: 85560
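You could use a single Awk pass to do the selection, instead of combining multiple option flags in sort:
awk '!(($1 FS $2) in unique) || unique[$1 FS $2] < $3 {unique[$1 FS $2] = $3} END {for (i in unique) print i, unique[i]}' file | sort -k1,1 -k2,2n
The idea is to build a hash table named unique whose key is the first two columns joined by the field separator, $1 FS $2. For each line, we store the third column under that key if the key has not been seen yet or if the value is larger than the one already stored, so the table ends up holding the largest value per unique key (the explicit "in" test keeps this correct for zero and negative values too). Once all the lines have been read, we print the hash table in the END clause. Awk does not guarantee the iteration order of for (i in unique), so the trailing sort puts the result back in key order.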
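If you would rather stay close to the sort command from the question, here is a minimal sketch of a sort-first variant. It assumes whitespace-separated columns, a numeric 2nd and 3rd column, and that ties on the maximum may be resolved arbitrarily; the function name max_per_key is only a hypothetical helper for illustration.

```shell
# Hypothetical helper: for each (column 1, column 2) key, print the line
# with the largest value in column 3. Takes the input file as "$1".
max_per_key() {
    # Step 1: sort by column 1, then column 2 (numeric), then column 3
    #         numerically in descending order, so the first line of every
    #         key group is the one with the maximum.
    # Step 2: awk keeps only the first line it sees for each key.
    sort -k1,1 -k2,2n -k3,3nr "$1" | awk '!seen[$1 FS $2]++'
}
```

Calling max_per_key file on the sample data should give the four lines from the question, already in key order. The trade-off compared with the pure hash-table approach is an O(n log n) sort, but only the keys (not every maximum candidate) need to be tracked by Awk.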
Upvotes: 1