aln

Reputation: 37

Sort and select the top records per key in Unix

My input file looks like this:

01,A,34
01,A,35
01,A,36
01,A,37
02,A,40
02,A,41
02,A,42
02,A,45

My output needs to be:

01,A,37
01,A,36
01,A,35
02,A,45
02,A,42
02,A,41

That is, for each key (1st and 2nd columns), select only the top three records, ranked by the value in the 3rd column in descending order.

Thanks in advance...

Upvotes: 1

Views: 559

Answers (2)

Dimitre Radoulov

Reputation: 28000

With Perl:

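perl -F, -lane '
  # Group the third-column values by the key formed from columns 1 and 2
  push @{ $h{ join ",", @F[0,1] } }, $F[2];
  END {
    for my $k (sort keys %h) {
      # Sort each group numerically, descending, and keep at most three values
      my @top = sort { $b <=> $a } @{ $h{$k} };
      splice @top, 3 if @top > 3;
      print join ",", $k, $_ for @top;
    }
  }' infile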

Upvotes: 0

paxdiablo

Reputation: 881323

You can use a simple bash script to do this provided the data is as shown.

pax$ cat infile
01,A,34
01,A,35
01,A,36
01,A,37
02,A,40
02,A,41
02,A,42
02,A,45

pax$ ./go.sh
01,A,37
01,A,36
01,A,35
02,A,45
02,A,42
02,A,41

pax$ cat go.sh
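# Build the list of unique keys (first two fields, keeping the trailing comma).
keys=$(sed 's/,[^,]*$/,/' infile | sort -u)
for key in ${keys} ; do
    # For each key: take its lines, sort them in descending order, keep the top three.
    grep "^${key}" infile | sort -r | head -3
done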

The first line gets the full set of keys, built from the first two fields: sed removes the final column, then sort -u sorts the result and removes duplicates. In this particular case, the keys are 01,A, and 02,A,.

It then extracts the relevant data for each key (the for loop in conjunction with grep), sorts it in descending order with sort -r, and keeps only the first three lines (for each key) with head.

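Now, if your key is likely to contain characters special to grep, such as . or [, you'll need to watch out, because grep treats the key as a regular expression. Also note that sort -r compares whole lines as text, so it only ranks the third column correctly when all its values have the same number of digits, as they do here.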
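If either of those caveats applies to your data, one possible alternative (a sketch, not part of the script above) is to sort the whole file once and let awk count lines per key. It assumes comma-separated fields with no embedded commas and a numeric third column; the function name top_n_per_key is just a hypothetical helper for illustration.

```shell
# top_n_per_key FILE [N]
# Hypothetical helper: prints the N (default 3) records with the largest
# third-column value for each key made of columns 1 and 2.
top_n_per_key() {
    file=$1
    n=${2:-3}
    # Sort by column 1, then column 2, then column 3 numerically descending,
    # so each key's records arrive together with the largest value first.
    sort -t, -k1,1 -k2,2 -k3,3nr "$file" |
        # Count the lines seen per key and print only the first n of each.
        # The key is compared as a plain string, so regex characters are harmless.
        awk -F, -v n="$n" '++seen[$1 FS $2] <= n'
}
```

Called as top_n_per_key infile 3, this should produce the same six lines as go.sh for the sample data, while also ranking values such as 9 and 10 numerically.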

Upvotes: 2
