spiral01
spiral01

Reputation: 545

piping awk outputs into another awk command

I have a tab delimited file as such:

1   68082   68082   0.003   0.0984  0.0845750981305074  1
1   69428   69428   0.0015  0.0497  0.04367900961171486 1
1   69761   69761   0.0045  0.1034  0.09005130799195755 1
1   69899   69899   0.0106  0.001   0.012825357055808327    1
1   70352   70352   0.0356  0.002   0.04128979333631639 1
1   136113  136113  0.0015  0.0278  0.02540996544374495 1
1   138396  138396  0.0008  0.0089  0.008567211104293392    1
1   872352  872352  0.4955  0.2803  0.48119634372979975 1
1   872467  872467  0.0121  0.004   0.01705890110859077 1
1   872564  872564  0.0015  0.002   0.0034277132094182  1

I would like to get the sum of column 6:

awk -F'[\t]' '{ total += $6 } END { print total }' file

And the sum of column 7:

awk -F'[\t]' '{ total += $7 } END { print total }' file

And then divide the sum of column 6 by the sum of column 7.

Is it possible to do this entirely in one command? Something like process substitution in bash?

Upvotes: 1

Views: 1313

Answers (1)

RavinderSingh13
RavinderSingh13

Reputation: 133650

Following awk may help you in same if we need to print only the 6th and 7th fields sum.

awk '{sum6+=$6;sum7+=$7} END{print "Sum of column 6:",sum6 RS "sum of column 7:",sum7;print "Divide of sum6 and sum7",sum6/sum7}' Input_file

Output will be as follows.

Sum of column 6: 0.808081
sum of column 7: 10
Divide of sum6 and sum7 0.0808081

EDIT: Solution2: Adding a non-one liner form of solution along with Sir Ed Morton's suggestion which includes logic in case sum of 7th column is 0 then it will not throw error in output.

awk '{
  sum6+=$6;
  sum7+=$7
}
END{
  print "Sum of column 6:",sum6 ORS "sum of column 7:",sum7;print "Divide of sum6 and sum7",sum7?sum6/sum7:"you are trying to divide with a 0, seems sum7 is zero."
}'   Input_file

Upvotes: 3

Related Questions