Reputation: 362
I found some questions about this, but none of them really answered my question.
I have a tabulated file like this:
2 10610 0 0 0 0.0105292
2 10649 0 0 0 0.041959
2 10682 0 0 0 0.0449746
2 10705 0 0 0 0.0441639
2 10797 2 0 0 0.0342728
2 10955 0 0 0 0.0136986
2 10957 0 0 0 0.0135135
2 11124 0 0 0 0.0583367
2 11336 1 0 0 0.0219502
and I used this command:
awk '{if ($6 > 0.4) print $6}' myfile
And here is the output:
0.0105292
0.041959
0.0449746
0.0441639
0.0342728
0.0136986
0.0135135
0.0583367
0.0219502
It's returning all the values in the 6th column. Here I should get no results, since none of them satisfies the condition. So I guess awk is not treating $6 as a float.
I tried other syntax but I still have the same problem.
I also tried the command on the first column and there it's working...
ps: I'm on MacOSX
Edit: Though it's working when I use awk '{print $6}'
Upvotes: 4
Views: 4145
Reputation: 203995
It's your locale setting (see https://www.gnu.org/software/gawk/manual/gawk.html#Locales and specifically https://www.gnu.org/software/gawk/manual/gawk.html#Locale-influences-conversions). Explicitly setting LC_ALL=C is one way to solve the problem:
LC_ALL=C awk '{if ($6 > 0.4) print $6}' myfile
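As a quick check of the fix above, here is a minimal sketch that recreates three rows of the sample data and runs the filter with LC_ALL=C. The temporary file and the extra 0.04 threshold are only for illustration; they are not part of the original question.

```shell
# Recreate three rows of the sample data in a temporary file.
tmpfile=$(mktemp)
printf '%s\n' \
  '2 10610 0 0 0 0.0105292' \
  '2 10649 0 0 0 0.041959' \
  '2 11124 0 0 0 0.0583367' > "$tmpfile"

# In the C locale "." is the decimal point, so $6 is compared numerically.
above_04=$(LC_ALL=C awk '$6 > 0.4 { print $6 }' "$tmpfile")
above_004=$(LC_ALL=C awk '$6 > 0.04 { print $6 }' "$tmpfile")

printf 'above 0.4:  [%s]\n' "$above_04"
printf 'above 0.04: [%s]\n' "$above_004"

rm -f "$tmpfile"
```

With the C locale forced, nothing should pass the 0.4 threshold, while the lower threshold lets through only the values that really exceed it.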
What's happening is that you're trying to use a decimal point of "." but your locale (typical in most European countries and many others) uses "," instead. So when your input contains:
0.0105292
awk does not recognize it as looking like a number in your locale, so instead it gets treated as a string. If your input was instead:
0,0105292
THEN awk would recognize it as a number (so this is the other way to solve your problem - use commas as the decimal point in your input).
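To see which decimal point your shell's locale actually uses, something like the following may help. It is a sketch that assumes the POSIX locale utility is installed; the variable names are just for illustration.

```shell
# Decimal point of whatever locale the current environment selects.
current_point=$(locale decimal_point)

# Decimal point of the C locale, which is always ".".
c_point=$(LC_ALL=C locale decimal_point)

printf 'current locale: "%s"\n' "$current_point"
printf 'C locale:       "%s"\n' "$c_point"
```

If the first line shows a comma, awk implementations that honor the locale may refuse to treat 0.0105292 as a number.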
So to awk, your code $6 > 0.4 is a string "0.0105292" being compared to a number 0.4 (per POSIX the "." is always the decimal point when used in the code), and per this comparison table from the gawk manual:
        +----------------------------------------------
        |       STRING          NUMERIC         STRNUM
--------+----------------------------------------------
        |
STRING  |       string          string          string
        |
NUMERIC |       string          numeric         numeric
        |
STRNUM  |       string          numeric         numeric
--------+----------------------------------------------
we see that the type of comparison performed when a string is compared to a number (or anything else) is a string comparison. So in your original code the string "0.0105292" is being string-compared with the number 0.4, and awk is apparently deciding that the former is greater than the latter (I don't know why; maybe some other locale effect).
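The difference between the two kinds of comparison can be sketched in the C locale, where the result is predictable. The last line of the awk program is only a guess at the mechanism: it assumes the affected awk renders 0.4 as "0,4" before comparing, which I have not verified on MacOSX.

```shell
# s: a string constant forces a string comparison, and "10" sorts before "9".
# n: two numeric constants are compared numerically.
# p: byte-wise, "." (0x2E) sorts after "," (0x2C), so "0.0..." > "0,4".
result=$(LC_ALL=C awk 'BEGIN {
    s = ("10" > 9)
    n = (10 > 9)
    p = ("0.0105292" > "0,4")
    print s, n, p
}')

printf '%s\n' "$result"
```

This prints 0 1 1: the string comparison disagrees with the numeric one, and a value written with "." compares as greater than one written with ",". That would be consistent with every row being printed, if the assumption about "0,4" holds.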
Upvotes: 11