Reputation: 37
My input file looks like this:
01,A,34
01,A,35
01,A,36
01,A,37
02,A,40
02,A,41
02,A,42
02,A,45
My output needs to be:
01,A,37
01,A,36
01,A,35
02,A,45
02,A,42
02,A,41
That is, for each key (1st and 2nd columns), select only the top three records, ranked by the value in the 3rd column.
Thanks in advance...
Upvotes: 1
Views: 559
Reputation: 28000
With Perl:
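perl -F, -lane'
  # group the values (3rd column) by key (1st and 2nd columns)
  push @{$_{join ",", @F[0,1]}}, $F[2];
  END {
    # sort the keys so the output order is stable
    for $k (sort keys %_) {
      print join ",", $k, $_
        for (sort { $b <=> $a } @{$_{$k}})[0..2]
    }
  }' infile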
Upvotes: 0
Reputation: 881323
You can use a simple bash script to do this, provided the data is as shown.
pax$ cat infile
01,A,34
01,A,35
01,A,36
01,A,37
02,A,40
02,A,41
02,A,42
02,A,45
pax$ ./go.sh
01,A,37
01,A,36
01,A,35
02,A,45
02,A,42
02,A,41
pax$ cat go.sh
keys=$(sed 's/,[^,]*$/,/' infile | sort -u)
for key in ${keys} ; do
    grep "^${key}" infile | sort -r | head -3
done
The first line gets the full set of keys, constructed from the first two fields by removing the final column with sed, then sorting the output and removing duplicates with sort -u. In this particular case, the keys are 01,A, and 02,A, (each keeps its trailing comma).
It then extracts the relevant data for each key (the for loop in conjunction with grep), sorting in descending order with sort -r, and getting only the first three (for each key) with head.
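One caveat on the sorting step: sort -r compares whole lines as text, which gives the right order here only because every value has two digits. A minimal sketch of a numeric variant, assuming comma-separated fields with the value in the third column (the function name sort_by_value_desc is made up for this example):

```shell
# Hypothetical helper: sort comma-separated lines from stdin by the
# third field, compared as a number, largest first.
sort_by_value_desc() {
    sort -t, -k3,3nr
}

# With values of different widths, plain `sort -r` would put 9 ahead of 100;
# the numeric key orders them by magnitude instead.
printf '%s\n' 01,A,9 01,A,100 01,A,10 | sort_by_value_desc | head -3
```

In go.sh this would take the place of sort -r in the pipeline inside the loop.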
Now, if your key is likely to contain characters special to grep, such as . or [, you'll need to watch out.
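One way to sidestep that is to avoid regex matching altogether. The following is only a sketch, assuming the same three-column, comma-separated layout (top_n is a name invented for this example): sort once by key and by descending numeric value, then let awk count the lines seen for each key and print only the first N. The keys are compared as literal strings, so . and [ need no escaping.

```shell
# Hypothetical helper: top_n N
# Reads key1,key2,value lines from stdin and prints, for each key1,key2
# pair, the N lines with the largest numeric value.
top_n() {
    # Group lines by key, largest value first within each key ...
    sort -t, -k1,1 -k2,2 -k3,3nr |
        # ... then keep a line only while its key has been seen at most N times.
        awk -F, -v n="$1" '++seen[$1 FS $2] <= n'
}
```

With the sample data, top_n 3 < infile should produce the same six lines as go.sh.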
Upvotes: 2