Linux; sort data and print only highest value of one column

Question

On Linux, I am trying to sort my data by the 1st and 2nd columns and print only the line with the highest value in the 3rd column. My data looks like this:

A 1 75.0
A 1 99.0
A 2 68.0
B 1 66.0
B 1 50.0
B 2 75.0
B 2 80.0

The "keys" are in columns 1 and 2. When columns 1 and 2 are equal, I want to print only the line that has the highest value in the 3rd column, like this:

A 1 99.0
A 2 68.0
B 1 66.0
B 2 80.0

I tried to do it with sort, using sort -k1,1 -k2,2, but how can I change the command so that it prints only the line with the highest value in column 3?

Inian · Accepted Answer

You could just use a single Awk solution for this, instead of combining multiple option flags in sort:

awk 'unique[$1FS$2]<$3{unique[$1FS$2]=$3; next}END{for (i in unique) print i,unique[i]}' file

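The idea is to build a hash table named unique, keyed on the first two columns ($1FS$2). For each line, the value in column 3 is stored if it is larger than the value currently held for that key. Once all the lines have been processed, the END clause prints the table. Note that for (i in unique) visits the keys in an unspecified order, so pipe the output through sort if the order matters. The comparison also treats a key that has not been seen yet as 0, so it assumes the values in column 3 are positive.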
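
As a minimal sketch of the same hash-table idea, the variant below tests explicitly whether a key has been seen before, so it should also cope with negative values, and it pipes the result through sort to get a deterministic order. The array name max, the variables key, input and result, and the temporary file are illustrative assumptions and are not part of the original answer.

```shell
# Sample data from the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
A 1 75.0
A 1 99.0
A 2 68.0
B 1 66.0
B 1 50.0
B 2 75.0
B 2 80.0
EOF

# Keep the largest column-3 value per (column 1, column 2) key.
# The "in" test handles the first occurrence of a key explicitly;
# adding 0 forces a numeric comparison. sort then orders the output
# by column 1 (text) and column 2 (numeric).
result=$(awk '
    { key = $1 FS $2 }
    !(key in max) || $3 + 0 > max[key] + 0 { max[key] = $3 }
    END { for (key in max) print key, max[key] }
' "$input" | sort -k1,1 -k2,2n)

printf '%s\n' "$result"
rm -f "$input"
```

For the sample data this should print the four lines requested in the question. Sorting after the Awk step, rather than before, means sort only has to handle one line per key.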
