lre1234

Reputation: 57

Counting and printing occurrences in a file

I have a file that looks like this:

cond1 20
cond1 10
cond1 5
cond2 12
cond3 10
cond3 9
cond3 1
cond4 2
cond5 10
cond5 8

I'm trying to sort the file by the first column, then by the second column (numerically, largest first), and then add a third column giving each row's rank within its first-column group. It would look like this:

cond1 20 1
cond1 10 2
cond1 5  3
cond2 12 1
cond3 10 1
cond3 9  2
cond3 1  3
cond4 2  1
cond5 10 1
cond5 8  2

I know that there is some awk or sed command that can do this, but I can't seem to figure it out. uniq -c doesn't do what I am looking for. Any advice would be appreciated.

Upvotes: 1

Views: 75

Answers (2)

Akshay Hegde

Reputation: 16997

Using sort and awk: after sorting, just reset the variable n whenever awk finds a new word in column 1 (no array needed).

$ sort -k1,1 -k2,2nr file | awk '$1!=p{n=0; p=$1}{print $0,++n}'

Input

$ cat f
cond1 20
cond1 10
cond1 5
cond2 12
cond3 10
cond3 9
cond3 1
cond4 2
cond5 10
cond5 8

Output

$ sort -k1,1 -k2,2nr f | awk '$1!=p{n=0; p=$1}{print $0,++n}' 
cond1 20 1
cond1 10 2
cond1 5 3
cond2 12 1
cond3 10 1
cond3 9 2
cond3 1 3
cond4 2 1
cond5 10 1
cond5 8 2
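
For readers new to awk, the one-liner above can be unpacked roughly as in the sketch below. The function name `rank_by_group` and the variable name `prev` are made up for illustration (the answer itself uses `p`); the logic is meant to be the same: sort, then restart the counter whenever column 1 changes.

```shell
# rank_by_group: hypothetical helper name, not part of the original answer.
# Reads "key value" lines on stdin and appends a per-key rank.
rank_by_group() {
    # Sort by column 1, then by column 2 numerically in descending order.
    sort -k1,1 -k2,2nr |
    awk '
        $1 != prev { n = 0; prev = $1 }   # new key in column 1: restart the counter
        { print $0, ++n }                 # append the running rank to every line
    '
}

# Small unsorted sample to show the effect.
printf '%s\n' 'cond3 9' 'cond1 5' 'cond1 20' 'cond3 10' | rank_by_group
# cond1 20 1
# cond1 5 2
# cond3 10 1
# cond3 9 2
```

Because the counter is reset only when the key changes from one line to the next, this approach relies on the input being grouped by column 1, which the sort guarantees.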

Upvotes: 0

Ed Morton

Reputation: 204628

$ awk '{print $0, ++rank[$1]}' file
cond1 20 1
cond1 10 2
cond1 5 3
cond2 12 1
cond3 10 1
cond3 9 2
cond3 1 3
cond4 2 1
cond5 10 1
cond5 8 2

If your original input file wasn't already sorted then prepend a call to sort:

$ sort -k1,1 -k2,2nr file | awk '{print $0, ++rank[$1]}'
cond1 20 1
cond1 10 2
cond1 5 3
cond2 12 1
cond3 10 1
cond3 9 2
cond3 1 3
cond4 2 1
cond5 10 1
cond5 8 2

and if you want the spacing lined up visually then append a call to column:

$ awk '{print $0, ++rank[$1]}' file | column -t
cond1  20  1
cond1  10  2
cond1  5   3
cond2  12  1
cond3  10  1
cond3  9   2
cond3  1   3
cond4  2   1
cond5  10  1
cond5  8   2
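
If you would rather get the exact spacing shown in the question (second column padded to two characters) without piping through column, a printf inside awk is one possible way. This is only a sketch: the width of 2 is an assumption taken from the sample data, and `rank_padded` is a hypothetical wrapper name.

```shell
# rank_padded: hypothetical helper name, used here only for illustration.
# Assumes values in column 2 are at most 2 characters wide, as in the sample.
rank_padded() {
    # %-2s left-justifies column 2 in a 2-character field before the rank.
    awk '{ printf "%s %-2s %d\n", $1, $2, ++rank[$1] }'
}

printf '%s\n' 'cond1 20' 'cond1 10' 'cond1 5' 'cond2 12' | rank_padded
# cond1 20 1
# cond1 10 2
# cond1 5  3
# cond2 12 1
```

Unlike column -t, this does not adapt to the data, so the width would need changing for longer values.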

Mix and match to taste.
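
One thing worth checking before skipping the sort: the array version still counts correctly per key on unsorted input, but the numbers then reflect input order rather than value order. A small sketch, where `rank_array` is just a hypothetical wrapper name around the same awk command:

```shell
# rank_array: hypothetical wrapper around the awk command from this answer.
rank_array() {
    awk '{ print $0, ++rank[$1] }'
}

# Interleaved, unsorted keys: each key keeps its own counter,
# so cond1 is still numbered 1, 2 even with cond2 in between.
printf '%s\n' 'cond1 5' 'cond2 12' 'cond1 20' | rank_array
# cond1 5 1
# cond2 12 1
# cond1 20 2
```

Here cond1 20 gets 2 simply because it appears second, which is why the sort is needed whenever the rank should follow the values in column 2.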

Upvotes: 2
