Reputation: 552
I have a file with 3 columns as shown below
col1,col2
a,x,1
b,y,2
a,x,0
b,x,2
b,y,0
a,y,0
I am working on an awk script to get following output : (grouped by col1 and col2 and counts of total, condition1, condition2)
col1,col2,total count,count where col3=0, count where col3>0
a,x,2,1,1
a,y,1,1,0
b,x,1,0,1
b,y,2,1,1
I worked out a script to get all 3 separately by using following command :
for case 3 : col3>0
awk -F',' '($3>0)NR>1{arr[$1","$2]++}END{for (a in arr) print a, arr[a]}' file
Similar command for other case as well.
I am unable to create a command/script to solve all 3 case in same go.
Any help is appreciated.
P.S.: This sample file is small so I can run 3 script/command and join them, but for real file is too big to run same thin 3 times.
Upvotes: 0
Views: 99
Reputation: 37404
Here's one:
$ awk '
BEGIN {
FS=OFS="," # field separators
}
NR>1 { # after header
k=$1 OFS $2 # set the key
a[k]++ # total count of unique $1 $2
b[k]+=($3==0) # count where $3==0
c[k]+=($3>0) # count where $3>0
}
END { # after all processing is done
for(i in a) # output values
print i,a[i],b[i],c[i]
}' file
Output (in random order but you can fix that with @Inian's tip in the comments):
a,y,1,1,0
b,x,1,0,1
b,y,2,1,1
a,x,2,1,1
Upvotes: 2