Reputation: 545
I have a tab delimited file as such:
1 68082 68082 0.003 0.0984 0.0845750981305074 1
1 69428 69428 0.0015 0.0497 0.04367900961171486 1
1 69761 69761 0.0045 0.1034 0.09005130799195755 1
1 69899 69899 0.0106 0.001 0.012825357055808327 1
1 70352 70352 0.0356 0.002 0.04128979333631639 1
1 136113 136113 0.0015 0.0278 0.02540996544374495 1
1 138396 138396 0.0008 0.0089 0.008567211104293392 1
1 872352 872352 0.4955 0.2803 0.48119634372979975 1
1 872467 872467 0.0121 0.004 0.01705890110859077 1
1 872564 872564 0.0015 0.002 0.0034277132094182 1
I would like to get the sum of column 6:
awk -F'[\t]' '{ total += $6 } END { print total }' file
And the sum of column 7:
awk -F'[\t]' '{ total += $7 } END { print total }' file
And then divide the sum of column 6 by the sum of column 7.
Is it possible to do this entirely in one command? Something like process substitution in bash?
Upvotes: 1
Views: 1313
Reputation: 133650
Following awk
may help you in same if we need to print only the 6th and 7th fields sum.
awk '{sum6+=$6;sum7+=$7} END{print "Sum of column 6:",sum6 RS "sum of column 7:",sum7;print "Divide of sum6 and sum7",sum6/sum7}' Input_file
Output will be as follows.
Sum of column 6: 0.808081
sum of column 7: 10
Divide of sum6 and sum7 0.0808081
EDIT: Solution2: Adding a non-one liner form of solution along with Sir Ed Morton's suggestion which includes logic in case sum of 7th column is 0
then it will not throw error in output.
awk '{
sum6+=$6;
sum7+=$7
}
END{
print "Sum of column 6:",sum6 ORS "sum of column 7:",sum7;print "Divide of sum6 and sum7",sum7?sum6/sum7:"you are trying to divide with a 0, seems sum7 is zero."
}' Input_file
Upvotes: 3