Adam Siemion

Reputation: 16039

Sort and select lines with the maximum value

I have a directory with files matching the name pattern A-B.diff, where A and B are numbers, e.g.:

100885-40843.diff
100885-41535.diff
100886-40500.diff
101036-41762.diff
101036-42346.diff
101038-42010.diff
101038-42127.diff
101038-43258.diff
101038-43873.diff

I would like to get a list of these files that keeps, for each distinct A, only the file with the largest B.

So for the given files the list should be:

100885-41535.diff
100886-40500.diff
101036-42346.diff
101038-43873.diff

Upvotes: 0

Views: 133

Answers (2)

Miles Yucht

Reputation: 564

One way using sort and uniq is

sort -t- -k2,2nr | sort -t- -k1,1nr -s | uniq -w6

-t- sets the field separator to the minus sign. The first sort orders the lines by the second field only (-k2,2), numerically (n) and from greatest to least (r). The second sort does the same on the first field, and -s makes it stable, so lines with the same A keep their descending-B order. Then uniq -w6 (a GNU extension) works like uniq, which drops adjacent duplicate lines, except that it compares only the first six characters, i.e. the six-digit A. After the two sorts, the first line for each A has the maximal B, so that is the line uniq keeps. On your input, this displays the output

101038-43873.diff
101036-42346.diff
100886-40500.diff
100885-41535.diff

I suppose if you want the list in its natural order you can append another sort -n (sort by number).
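
For trying this out, here is a minimal sketch of a variant of the pipeline above with explicit numeric keys, sorting A in ascending order so no final sort is needed. It assumes GNU sort and uniq and that every A has exactly six digits, as in the sample data; the function name max_b_per_a is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads names of the form A-B.diff on stdin and prints,
# for each A, the name with the largest B, in ascending order of A.
# Step 1: sort by B, numerically, largest first.
# Step 2: sort by A, numerically; -s keeps the B order within each A.
# Step 3: keep the first line for each six-character prefix (GNU uniq -w).
max_b_per_a() {
    sort -t- -k2,2nr | sort -t- -k1,1n -s | uniq -w6
}

# Feed it the sample names from the question.
max_b_per_a <<'EOF'
100885-40843.diff
100885-41535.diff
100886-40500.diff
101036-41762.diff
101036-42346.diff
101038-42010.diff
101038-42127.diff
101038-43258.diff
101038-43873.diff
EOF
```

In a real directory the input could come from something like printf '%s\n' *-*.diff piped into the function.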

Upvotes: 3

fedorqui

Reputation: 289745

If you store the list of names in a file, this does it:

$ awk -F"[-.]" '{if ($2 > a[$1]) a[$1]=$2} END{for (i in a) printf "%s-%s.diff\n",i, a[i]}' file
100885-41535.diff
100886-40500.diff
101036-42346.diff
101038-43873.diff

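This loops through the list of names, building an array in which a[1st part] holds the biggest 2nd part seen so far. Note that awk does not guarantee the order of for (i in a), so pipe the result through sort if the order matters.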
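
The same idea can be sketched as a small filter that reads the names on standard input instead of from a file. This is only an illustration under a few assumptions: the function name max_b_awk is hypothetical, the "+ 0" is added to force a numeric comparison, and the trailing sort is there because the iteration order of for (k in max) is unspecified. Unlike a fixed-width prefix match, it does not assume A has six digits.

```shell
#!/bin/sh
# Hypothetical helper: reads A-B.diff names on stdin and prints, for each A,
# the name with the largest B, ordered by A.
max_b_awk() {
    awk -F'[-.]' '
        # $1 is A and $2 is B; store B when A is new or B beats the stored value
        !($1 in max) || $2 + 0 > max[$1] + 0 { max[$1] = $2 }
        END { for (k in max) printf "%s-%s.diff\n", k, max[k] }
    ' | sort -t- -k1,1n
}

# Feed it the sample names from the question.
max_b_awk <<'EOF'
100885-40843.diff
100885-41535.diff
100886-40500.diff
101036-41762.diff
101036-42346.diff
101038-42010.diff
101038-42127.diff
101038-43258.diff
101038-43873.diff
EOF
```

Run inside the directory, the input could be produced with printf '%s\n' *-*.diff.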

Upvotes: 1
