LamaMo

Reputation: 626

Using a shell variable inside awk results in division by zero

I'm writing a simple shell command using awk, as follows:

input_folder='/home/Desktop/files'
results_folder='/home/results'

for entry in $input_folder/*
do

re=$(samtools view -H $entry | grep -P '^@SQ' | cut -f 3 -d ':' | awk '{sum+=$1} END {print sum}')

echo -e "$(samtools depth $entry | awk '{sum+=$3} END { print $(sum/$re)}')\t/$entry" >> $results_folder/Results.txt

done

The result in the variable re is a number, but using re in the second command, print $(sum/$re), gives me this error:

awk: cmd. line:1: (FILENAME=- FNR=312843568) fatal: division by zero attempted

I also tried dropping the $ before the variable, but got the same error.

Any help with that please?

Upvotes: 2

Views: 778

Answers (4)

RARE Kpop Manifesto

Reputation: 2809

In my own library, I conjured up this chained assignment code that generates 4 global constants, each being one of the 4 floating point edge case values.

Whether the function was called with any parameter value is moot, since the first step is cleansing it to numeric zero, before traversing the nested assigns :

0^-1 =>  +inf
       -(+inf) => -inf
                 (-inf) - (-inf) => +nan
                                  -(+nan) => -nan

function init_nan_inf(_) {

    return CONST_NEG_NAN = -(CONST_POS_NAN = _ -=  (_ = \
           CONST_NEG_INF = -(CONST_POS_INF = _ ^= -(_ = _ < _)^_)))
}

This chain is POSIX-compliant, works on all awk variants I've tested, doesn't require calling either log(0) or sqrt(-1), and certainly doesn't trigger any annoying error messages.

Checking for inf / nan is a different matter entirely. Because of nawk's non-compliant behavior when it comes to NaNs, I had to stitch together these strange functions so any awk can correctly identify them :

function is_inf(_) {

    return (_ == CONST_POS_INF) != (CONST_NEG_INF == _)
}
function is_inf_neg(_) {

    return (_ == CONST_POS_INF) < (CONST_NEG_INF == _)
}
function is_inf_pos(_) {

    return (_ == CONST_NEG_INF) < (CONST_POS_INF == _)
}
function is_inf_or_nan(_) {

    return (_ != _) + (CONST_POS_INF == _) ||
                (_ ==  CONST_NEG_INF)
}
function is_nan(_) {

    return ! FLG_ANY_GAWK ? (CONST_POS_INF  == _) * \
                       (_ == CONST_NEG_INF) : (_ != _)
}

It simply leverages the 4 global constants created by the function above.

As for FLG_ANY_GAWK, it could be generated via

FLG_ANY_GAWK = (sprintf("%c", -1) < sprintf("%c", 1)) 

It works because GNU awk has decided that no negative inputs are considered valid when it comes to the "%c" format in s/printf(), so it consistently spits out the null byte \000 | 0x00 regardless of whether you entered -1 or -9007199254740991. Every other normally functioning awk variant will give you \377 | 0xFF, yielding FALSE for that string less-than (<) comparison.

The expression is subversion-proof since it directly measures and gauges intrinsic behavior of awks rather than relying on any parameter or value in ARGV[], ENVIRON[], or somewhere in the shell. At the same time, awks have no constructs to enable operator overloading at all, so < always means the same thing.


Despite being a 47yo language, assignments in awk, like many modern counterparts such as Rust, are expressions, so chaining them is idiomatic. Dedicated "assignment expressions" syntax like := add complexity to the language spec without contributing meaningful benefits that weren't already present in expression-based ones.

Upvotes: 0

Tyl

Reputation: 5252

Change the awk part to:

awk -v re="$re" '{sum+=$3} END { if(re) print sum/re; else print "oo";}'

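You have to use -v to pass the shell variable into awk. Inside the single-quoted awk program the shell does not expand $re, so awk only sees its own uninitialized variable re, which evaluates to 0.
It's also better to check whether re is zero before dividing.
I used oo to represent the infinity symbol.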
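
A minimal sketch of the same idea that runs without samtools. The three tab-separated sample lines only imitate the name, position, depth layout of samtools depth output, and the value of re is made up for illustration:

```shell
# Hypothetical stand-in for the total reference length computed earlier.
re=10

# Fake depth lines (name, position, depth); the depths sum to 30.
result=$(printf 'chr1\t1\t5\nchr1\t2\t10\nchr1\t3\t15\n' |
    awk -v re="$re" '{sum+=$3} END { if (re) print sum/re; else print "oo" }')

# Same program with re set to 0: the guard prints "oo" instead of dividing.
zero=$(printf 'chr1\t1\t5\n' |
    awk -v re=0 '{sum+=$3} END { if (re) print sum/re; else print "oo" }')

echo "$result"
echo "$zero"
```

With these sample values the first command divides 30 by 10, and the second takes the else branch because re is 0.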

Upvotes: 2

Kubator

Reputation: 1383

Splicing the bash variable directly into the awk program will also do the job:

awk '{sum+=$3} END { print(sum/'${re}') }'

You'd better also check in bash that ${re} is non-zero before passing it to awk.
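
One possible way to do that check, sketched with a hypothetical helper named mean_depth (not part of the question) and made-up input lines in place of samtools depth output. It assumes re is meant to be a positive integer and passes it with -v rather than splicing it into the program text:

```shell
# mean_depth RE: reads depth lines on stdin, prints sum of column 3 divided by RE.
# Prints "NA" when RE is empty, not an integer, or not greater than zero.
mean_depth() {
    if [ "$1" -gt 0 ] 2>/dev/null; then
        awk -v re="$1" '{sum+=$3} END { print sum/re }'
    else
        echo "NA"
    fi
}

# Depths 4 and 6 sum to 10; dividing by 5 is safe.
ok=$(printf 'chr1\t1\t4\nchr1\t2\t6\n' | mean_depth 5)

# A zero divisor is rejected before awk is ever started.
bad=$(printf 'chr1\t1\t4\n' | mean_depth 0)

echo "$ok $bad"
```

The test uses the shell's integer comparison, so a value such as an empty string or text also falls through to the "NA" branch instead of reaching awk.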

Upvotes: 0

RavinderSingh13

Reputation: 133428

I am not clear why you are sending the output of the echo command to awk. Your awk command should be changed to avoid the error (which tells you that you are dividing by zero). Try changing your awk program to the following:

awk -v re="$re" '{sum+=$3} END {if(re){print (sum/re)} else {print "Please check, seems value of re is ZERO, else you will get an error from awk program"}}'

Upvotes: 2
