madkitty

Reputation: 1675

Absolute value in awk doesn't work?

I want to select the lines of a file where the absolute value of column 9 is less than 500. The column is sometimes positive, sometimes negative.

awk -F'\t' '{ if ($9 < |500|) {print $0} }' > output.bam

This doesn't work so far. A quick search online told me that to use the absolute value we should add

func abs(x) { return (x<0) ? x*-1 : x }

Then how am I supposed to use this with the value of column 9? I don't know what the proper syntax would be.

Upvotes: 25

Views: 39366

Answers (6)

Chris Davies

Reputation: 644

Here's another approach:

awk '
    # Calculate the sign of a number (-1, 0, 1)
    function sgn(n) {
        if (!n) { return 0 } else { return (n < 0) ? -1 : 1 }
    }

    # Calculate the absolute value of a number
    function abs(n) {
        return sgn(n)*n
    }

    # Demonstrate value, sign, and absolute value
    {
        printf "%s\t%d\t%s\n", $1, sgn($1), abs($1)
    }
'

Example

printf '%s\n' 1.4 -1.6 | awk…

1.4     1       1.4
-1.6    -1      1.6

Upvotes: 0

RARE Kpop Manifesto

Reputation: 2809

Hopefully this accounts for as many scenarios as possible:

  {m,g}awk '
  function abs(_) { 
     return        \
     (""==_ || _==+_) \
        ?   (+_ <= -_  ? -_ :+_) \
     : _!~"[Ii][Nn][Ff]|[Nn][Aa][Nn]|[0-9]" \
        ?              _\
     : substr("",sub("^[ \t]*[-]?[+]*","+",_))_ 

  } BEGIN { 
       OFS  =      FS = "="
       OFMT = CONVFMT = "%+.1"(__=5)"g" 
        ___ = \
              ".<-.str.|.numeric.->." 
  }  {           
      $++NF = abs($!_) 
      $++NF = ___
        $__ =    +$!_
      $++NF = abs($__); print 

  } END {
      print _=sprintf("%.50g",log(_=_<_)), abs(_),___,
                            _=log(_<_),    abs(_)

      print _=sprintf("%.50g", -(-log(_=_<_)/-log(_))), 
                             abs(_),___,_=+_, abs(_)  }'


-49386.673343919203   +49386.6733439192  .<-.str.|.numeric.->.  -49386.6733439192  +49386.6733439192
-37041.047348385706   +37041.0473483857  .<-.str.|.numeric.->.  -37041.0473483857  +37041.0473483857
-24695.421352852205   +24695.4213528522  .<-.str.|.numeric.->.  -24695.4213528522  +24695.4213528522
-12349.795357318704   +12349.7953573187  .<-.str.|.numeric.->.  -12349.7953573187  +12349.7953573187
-4.169361785203       +4.169361785203    .<-.str.|.numeric.->.  -4.169361785203    +4.169361785203
-inf                  +inf               .<-.str.|.numeric.->.  -inf               +inf
-nan                  +nan               .<-.str.|.numeric.->.  -nan               +nan
12341.456633748297    +12341.4566337483  .<-.str.|.numeric.->.  +12341.4566337483  +12341.4566337483
24687.082629281798    +24687.0826292818  .<-.str.|.numeric.->.  +24687.0826292818  +24687.0826292818
37032.708624815299    +37032.7086248153  .<-.str.|.numeric.->.  +37032.7086248153  +37032.7086248153

Upvotes: -2

Roman Kogan

Reputation: 389

There is a loss of precision using sqrt($9^2). That might be a problem if you want to print the absolute value as well.

Solution: process as text, and simply remove the leading minus sign, if present.

This guarantees that the output matches the input exactly.

Code:

awk '{sub("^-", "", $9); if ($9+0 < 500) print $9}' inputfile

Summary: to get absolute value using awk, simply remove the leading minus (-) character from a field, if present.
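
To make the precision point concrete, here is a minimal sketch comparing the two approaches on a single value. The helper names abs_text and abs_sqrt are made up for illustration, and the sample number is arbitrary. The difference shows up when the value is printed: the text version prints the field as typed, while the numeric version is reformatted through OFMT.

```shell
# Hypothetical helper names, for illustration only.

# Text approach: strip a leading minus sign and print the field unchanged.
abs_text() { awk '{ sub(/^-/, "", $1); print $1 }'; }

# Numeric approach: square, take the root, and print the resulting number.
abs_sqrt() { awk '{ print sqrt($1^2) }'; }

echo -1234567.891 | abs_text   # prints 1234567.891
echo -1234567.891 | abs_sqrt   # prints 1.23457e+06 with the default OFMT (%.6g)
```

If the numeric form is needed for printing anyway, an explicit printf format gives more control over the digits than relying on OFMT.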

Upvotes: 1

champost

Reputation: 523

Is this too obvious and/or not elegant?

awk -F'\t' '$9 < 500 && $9 > -500' > output.bam

Upvotes: 3

TheAmigo

Reputation: 1072

For quick one-liners, I use this approach:

awk -F'\t' 'sqrt($9*$9) < 500' > output.bam

It's quick to type, but for large jobs, I'd imagine that sqrt() would impose a performance hit.

Upvotes: 20

Kane

Reputation: 908

awk -F'\t' 'function abs(x){return ((x < 0.0) ? -x : x)} {if (abs($9) < 500) print $0}'
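
For anyone who wants to try it, here is a rough sketch of the same idea spread over several lines and fed some made-up input. The wrapper name abs_filter and the sample rows are assumptions for illustration; only column 9 matters.

```shell
# Hypothetical wrapper around the one-liner above, reading from stdin.
abs_filter() {
    awk -F'\t' '
        # Return the magnitude of x.
        function abs(x) { return (x < 0) ? -x : x }

        # A bare pattern with no action prints every matching line.
        abs($9) < 500
    '
}

# Four made-up rows whose 9th column is -120, 499, -500 and 700.
printf 'c1\tc2\tc3\tc4\tc5\tc6\tc7\tc8\t%s\n' -120 499 -500 700 | abs_filter
```

Only the rows with -120 and 499 come through, since -500 and 700 do not have a magnitude below 500. With a real file you would redirect it in, for example abs_filter < input > output.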

Upvotes: 38
