Divide floats in awk

Question

I have written a script to calculate a z-score. It computes the mean and standard deviation from one file and applies them to values from the rows of another file, as follows:

 mean=$(awk '{total += $2; count++} END {print total/count}' ABC_avg.txt)
#calculating mean of the second column of the file
std=$(awk '{x[NR]=$2; s+=$2; n++} END{a=s/n; for (i in x){ss += (x[i]-a)^2} sd = sqrt(ss/n); print sd}' ABC_avg.txt)
#calculating standard deviation from the second column of the same file
awk '{if (std) print $2-$mean/$std}' ABC_splicedavg.txt > ABC.tmp
#calculate the zscore for each row and store it in a temporary file
zscore=$(awk '{total += $0; count++} END {if (count) print total/count}' ABC.tmp)
#calculate an average of all the zscores in the rows and store it in a variable 
echo $motif"  "$zscore
rm ABC.tmp

However, when I run this code, the step that creates the temp file fails with the error fatal: division by zero attempted. What is the right way to implement this? I also tried bc -l, but it prints the floating-point result with a very long run of digits. TIA

karakfa · Accepted Answer

Here is a script that computes the mean and standard deviation in one pass. You may lose some numerical resolution this way; if that is not acceptable, there are alternatives.

$ awk '{print rand()}' <(seq 100) |
    awk '{sum+=$1; sqsum+=$1^2}
      END{print mean=sum/NR, std=sqrt(sqsum/NR-mean^2), z=mean/std}'

0.486904 0.321789 1.51312

Your script for z-score for each sample is wrong! You need to do ($2-mean)/std.
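
A minimal sketch of how the whole calculation could fit together, assuming the values sit in column 2 of both files as in the question. Shell variables are not expanded inside a single-quoted awk program (there, $mean means "the field whose number is held in awk's own variable mean"), so the values are passed in with -v. The sample data and the temporary directory are made up for illustration and only stand in for ABC_avg.txt and ABC_splicedavg.txt.

```shell
# Hypothetical sample data standing in for the two files from the question.
tmpdir=$(mktemp -d)
printf 'a 2\nb 4\nc 4\nd 4\ne 5\nf 5\ng 7\nh 9\n' > "$tmpdir/ABC_avg.txt"
printf 'x 5\ny 9\n' > "$tmpdir/ABC_splicedavg.txt"

# One pass over the reference file: mean and population standard deviation.
stats=$(awk '{s += $2; ss += $2^2}
             END {if (NR) {m = s / NR; print m, sqrt(ss / NR - m^2)}}' \
        "$tmpdir/ABC_avg.txt")
mean=${stats% *}   # text before the space
std=${stats#* }    # text after the space

# Pass the shell values into awk with -v, parenthesize the subtraction,
# and skip the division entirely when std is zero.
zscore=$(awk -v mean="$mean" -v std="$std" '
    std > 0 {total += ($2 - mean) / std; count++}
    END     {if (count) print total / count}' "$tmpdir/ABC_splicedavg.txt")

echo "$zscore"
rm -rf "$tmpdir"
```

This version also averages the z-scores directly, so no temporary file is needed. If the output has more digits than wanted, printf "%.3f\n", total / count can be used in place of print.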
