Reputation: 1675
I want to select the lines of a file where the absolute value of column 9 is less than 500. The column is sometimes positive, sometimes negative.
awk -F'\t' '{ if ($9 < |500|) {print $0} }' > output.bam
This doesn't work so far. A quick search online told me that to use the absolute value we should add
func abs(x) { return (x<0) ? x*-1 : x }
Then how am I supposed to combine this with the value of column 9? I don't know what the proper syntax would be.
Upvotes: 25
Views: 39366
Reputation: 644
Here's another approach:
awk '
# Calculate the sign of a number (-1, 0, 1)
function sgn(n) {
# Compare numerically: searching the text for "-" misreads values such as 1e-05
if (!n) { return 0 } else { return (n < 0) ? -1 : 1 }
}
# Calculate the absolute value of a number
function abs(n) {
return sgn(n)*n
}
# Demonstrate value, sign, and absolute value
{
printf "%s\t%d\t%s\n", $1, sgn($1), abs($1)
}
'
Example
printf '%s\n' 1.4 -1.6 | awk…
1.4 1 1.4
-1.6 -1 1.6
Upvotes: 0
Reputation: 2809
Hopefully this accounts for as many scenarios as possible:
{m,g}awk '
function abs(_) {
return \
(""==_ || _==+_) \
? (+_ <= -_ ? -_ :+_) \
: _!~"[Ii][Nn][Ff]|[Nn][Aa][Nn]|[0-9]" \
? _\
: substr("",sub("^[ \t]*[-]?[+]*","+",_))_
} BEGIN {
OFS = FS = "="
OFMT = CONVFMT = "%+.1"(__=5)"g"
___ = \
".<-.str.|.numeric.->."
} {
$++NF = abs($!_)
$++NF = ___
$__ = +$!_
$++NF = abs($__); print
} END {
print _=sprintf("%.50g",log(_=_<_)), abs(_),___,
_=log(_<_), abs(_)
print _=sprintf("%.50g", -(-log(_=_<_)/-log(_))),
abs(_),___,_=+_, abs(_) }'
-49386.673343919203 +49386.6733439192 .<-.str.|.numeric.->. -49386.6733439192 +49386.6733439192
-37041.047348385706 +37041.0473483857 .<-.str.|.numeric.->. -37041.0473483857 +37041.0473483857
-24695.421352852205 +24695.4213528522 .<-.str.|.numeric.->. -24695.4213528522 +24695.4213528522
-12349.795357318704 +12349.7953573187 .<-.str.|.numeric.->. -12349.7953573187 +12349.7953573187
-4.169361785203 +4.169361785203 .<-.str.|.numeric.->. -4.169361785203 +4.169361785203
-inf +inf .<-.str.|.numeric.->. -inf +inf
-nan +nan .<-.str.|.numeric.->. -nan +nan
12341.456633748297 +12341.4566337483 .<-.str.|.numeric.->. +12341.4566337483 +12341.4566337483
24687.082629281798 +24687.0826292818 .<-.str.|.numeric.->. +24687.0826292818 +24687.0826292818
37032.708624815299 +37032.7086248153 .<-.str.|.numeric.->. +37032.7086248153 +37032.7086248153
Upvotes: -2
Reputation: 389
There is a loss of precision using sqrt($9^2). That might be a problem if you want to print the absolute value as well.
Solution: process as text, and simply remove the leading minus sign, if present.
This guarantees that the output matches the input exactly.
Code:
awk '{sub("^-", "", $9); if ($9 + 0 < 500) print $9}' inputfile
Summary: to get absolute value using awk, simply remove the leading minus (-) character from a field, if present.
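To illustrate the precision point, here is a minimal sketch that prints the absolute value of the same field both ways. The function name `abs_both` and the sample value are made up for the demonstration, and it assumes the default OFMT of "%.6g".

```shell
# Hypothetical helper: for each input line, print the absolute value of
# field 1 computed numerically and then textually.
abs_both() {
    awk '{
        num = ($1 < 0) ? -$1 : $1      # numeric: print formats it through OFMT
        txt = $1; sub(/^-/, "", txt)   # textual: drop the sign, keep every digit
        print num, txt
    }'
}

# Made-up sample value with more digits than OFMT preserves.
printf '%s\n' '-49386.673343919203' | abs_both
# prints: 49386.7 49386.673343919203
```

The numeric result is rounded on output, while the text version keeps the digits as they appeared in the input.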
Upvotes: 1
Reputation: 523
Is this too obvious and/or not elegant?
awk -F'\t' '$9 < 500 && $9 > -500' > output.bam
Upvotes: 3
Reputation: 1072
For quick one-liners, I use this approach:
awk -F'\t' 'sqrt($9*$9) < 500' > output.bam
It's quick to type, but for large jobs, I'd imagine that sqrt() would impose a performance hit.
Upvotes: 20
Reputation: 908
awk -F'\t' 'function abs(x){return ((x < 0.0) ? -x : x)} {if (abs($9) < 500) print $0}'
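As a rough usage sketch, the same function can be wrapped in a small shell helper and tried on a few rows. The helper name `abs_filter`, its two parameters, and the sample data are assumptions for illustration only, not part of the original command.

```shell
# Hypothetical helper: print rows of tab-separated stdin whose column COL
# (first argument) has an absolute value below LIMIT (second argument).
abs_filter() {
    awk -F'\t' -v col="$1" -v limit="$2" '
        function abs(x) { return (x < 0) ? -x : x }
        abs($col) < limit    # a bare condition prints the whole line when true
    '
}

# Made-up sample: an id in column 1 and a signed value in column 2.
printf 'a\t-120\nb\t750\nc\t499\nd\t-500\n' | abs_filter 2 500
# prints the rows for a and c; d is excluded because 500 is not less than 500
```

For the original question this would be called as `abs_filter 9 500 < input > output`, where the file names are placeholders.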
Upvotes: 38