Sort and select lines with the maximum value

Question

I have a directory with files whose names match the pattern A-B.diff, where A and B are numbers, e.g.:

100885-40843.diff
100885-41535.diff
100886-40500.diff
101036-41762.diff
101036-42346.diff
101038-42010.diff
101038-42127.diff
101038-43258.diff
101038-43873.diff

I would like to get a list of these files that meets the following criteria:

for every A there is only one file
B has the maximum value available for the given A

So for the given files the list should be:

100885-41535.diff
100886-40500.diff
101036-42346.diff
101038-43873.diff

fedorqui · Accepted Answer

If the file names are stored in a file, one per line, this does it:

$ awk -F"[-.]" '{if ($2 > a[$1]) a[$1]=$2} END{for (i in a) printf "%s-%s.diff\n", i, a[i]}' file
100885-41535.diff
100886-40500.diff
101036-42346.diff
101038-43873.diff

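This loops through the list of file names, splitting each one on - and ., and builds an array in which a[A] holds the largest B seen so far for that A. The END block then prints one name per A. Note that awk visits the keys of for (i in a) in an unspecified order, so pipe the output through sort if the order matters.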
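
The same idea can be applied to the directory listing directly, without first saving the names to a file. The following is only a sketch under a few assumptions: the function name max_per_prefix is made up for illustration, file names contain no newlines, and a trailing sort is added so that the output order is deterministic. The (A in max) test is there so that the first B seen for an A is always stored, whatever its value.

```shell
# Hypothetical helper (name chosen for illustration only).
# Reads A-B.diff names on stdin, one per line, and prints for each A
# the name with the numerically largest B.
max_per_prefix() {
    awk -F'[-.]' '
        # Store B when A is new, or when B is numerically larger than
        # the value stored so far for this A.
        !($1 in max) || $2 + 0 > max[$1] + 0 { max[$1] = $2 }
        # Rebuild one file name per A from the stored maximum.
        END { for (a in max) printf "%s-%s.diff\n", a, max[a] }
    ' | sort -t- -k1,1n    # awk key order is unspecified, so sort by A
}

# Possible usage, run from the directory that holds the files:
#   printf '%s\n' *-*.diff | max_per_prefix
```

With the nine file names from the question as input, this should print the same four names as the one-liner above, sorted by A.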
