cartman619
cartman619

Reputation: 552

awk script for counting records based on multiple comditions

I have a file with 3 columns as shown below

col1,col2
a,x,1
b,y,2
a,x,0
b,x,2
b,y,0
a,y,0

I am working on an awk script to get following output : (grouped by col1 and col2 and counts of total, condition1, condition2)

col1,col2,total count,count where col3=0, count where col3>0
a,x,2,1,1
a,y,1,1,0
b,x,1,0,1
b,y,2,1,1

I worked out a script to get all 3 separately by using following command :

for case 3 : col3>0
awk -F',' '($3>0)NR>1{arr[$1","$2]++}END{for (a in arr) print a, arr[a]}' file 

Similar command for other case as well.

I am unable to create a command/script to solve all 3 case in same go.

Any help is appreciated.

P.S.: This sample file is small so I can run 3 script/command and join them, but for real file is too big to run same thin 3 times.

Upvotes: 0

Views: 99

Answers (1)

James Brown
James Brown

Reputation: 37404

Here's one:

$ awk '
BEGIN {                        
    FS=OFS=","                 # field separators
}
NR>1 {                         # after header
    k=$1 OFS $2                # set the key
    a[k]++                     # total count of unique $1 $2
    b[k]+=($3==0)              # count where $3==0
    c[k]+=($3>0)               # count where $3>0
}
END {                          # after all processing is done
    for(i in a)                # output values
        print i,a[i],b[i],c[i]
}' file

Output (in random order but you can fix that with @Inian's tip in the comments):

a,y,1,1,0
b,x,1,0,1
b,y,2,1,1
a,x,2,1,1

Upvotes: 2

Related Questions