gone

Reputation: 1129

AWK Numeric Variable treated as string

[Ubuntu 14.04, GNU Awk 4.0.1]

I have a strange problem... I am assigning a numeric value, that is retrieved from an input file, to a custom variable. When I print it, it displays correctly, and printing its length displays the right number of digits.
However, when I use the variable in a loop, my loop stops when index becomes greater than the most significant digit of my variable.

I have tried a for loop, and now a while loop; both suffer from the same problem.

With the file I'm processing, samples contains the value 8092, and the loop stops on the 9th iteration.

#!/usr/bin/awk -f
BEGIN {
  samples = 0;
}
{
  ...
  samples = $24;
}
END {
  i = 1;
  while (i <= samples ) {
    if (i>samples) { print "This is the end.\n " i " is bigger than " samples;}
    i++;
  }
}

I am very new to AWK, and can't see why this is occurring. After reading a number of tutorials, I'm under the impression that AWK is able to convert between string & numeric representations of numbers as required.

Can someone help me see what I've done wrong?

Solution: The answer was, as JNevill & ghoti suggested, to add 0 to the variable. In my case, the best place was just before the loop, as samples is rewritten during the body of the AWK script. Thanks.

Upvotes: 0

Views: 2748

Answers (2)

ghoti

Reputation: 46836

Awk doesn't exactly "convert" between representations; it simply uses whatever you give it, adjusting context based on usage. Thus, when evaluating booleans, any non-zero number evaluates to TRUE, and any non-empty string evaluates to TRUE (input that looks like a number, such as a field containing 0, is treated as a number).

I can't see what's really in your samples variable, but if you want to force things to be evaluated as a number before you start your loop, you might be able to simply add zero to the variable, i.e.:

samples = $24 + 0;
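As a rough illustration of what adding zero does, the sketch below feeds a few string constants through + 0. The function name force_numeric_demo and the sample values are made up for this example; they are not taken from the question's data.

```shell
#!/bin/sh
# Hypothetical demo: how awk converts strings to numbers when 0 is added.
force_numeric_demo() {
  awk 'BEGIN {
    print "8092" + 0        # a clean numeric string
    print "8092\r" + 0      # numeric prefix followed by a carriage return
    print "8092abc" + 0     # numeric prefix followed by letters
    print "abc" + 0         # no numeric prefix at all
  }'
}

force_numeric_demo
```

This should print 8092 three times and then 0: the conversion keeps the leading numeric part and drops whatever follows it.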

Also, if your source data came from a DOS/Windows machine and has line endings that include carriage returns (\r\n), and $24 is the last field on each line, then you may be comparing i against 8092\r, which is likely not to give you the results you expect.

To see what's really in your input data, try the following (substituting the name of your input file for samples):

cat -vet samples | less

If you see ^M before the $ at the end of each line, then your input file contains carriage returns, and you should process it appropriately before asking awk to parse its content.
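One possible way to handle this inside awk itself, rather than preprocessing the file, is to strip a trailing carriage return from each record before reading the field. This is only a sketch: the input line is invented, the field is $2 rather than $24 to keep it short, and strip_cr_and_count is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical input: one record whose last field is followed by a carriage return.
strip_cr_and_count() {
  printf 'x 8092\r\n' | awk '
    {
      sub(/\r$/, "")        # drop a trailing CR; awk re-splits the record afterwards
      samples = $2          # stand-in for $24 in the question
    }
    END {
      n = 0
      for (i = 1; i <= samples; i++) n++
      print n               # number of loop iterations
    }'
}

strip_cr_and_count
```

With the carriage return removed, the field looks like a plain number, so the loop runs 8092 times.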

In fact, I think it's pretty clear that since your input data begins with the character "8" and your loop stops on the 9th iteration, your comparison of i to samples is one of strings rather than numbers.
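The effect can be reproduced with a small sketch. Here a string constant stands in for whatever made samples a string in the real script, which is an assumption, and the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical demo: the same loop with a string bound and with a numeric bound.
count_as_string() {
  awk 'BEGIN {
    samples = "8092"        # string constant, so i <= samples compares text
    n = 0
    for (i = 1; i <= samples; i++) n++
    print n
  }'
}

count_as_number() {
  awk 'BEGIN {
    samples = "8092" + 0    # adding 0 yields a number, so the comparison is numeric
    n = 0
    for (i = 1; i <= samples; i++) n++
    print n
  }'
}

echo "string comparison:  $(count_as_string) iterations"
echo "numeric comparison: $(count_as_number) iterations"
```

As text, "1" through "8" all sort at or before "8092", but "9" sorts after it, so the string version stops once i reaches 9, after 8 iterations. The numeric version runs all 8092.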

Upvotes: 2

JNevill

Reputation: 50034

awk decides the type of a variable depending on what value is held in the variable. You can force it to type the way you want, though it's a bit hacky (isn't everything though).

Try adding 0 to your variable before hitting the for loop: samples = samples + 0, for instance. Now no matter what awk thought before you hit that line, it will treat your number as a number and your for loop should execute as expected.

Odd, though, that it was executing at all and stopping at 9 iterations. It suggests that perhaps it is already treating it correctly and you may be assuming that the value is 8092 when it is, in fact, 9. Also, that printed bit inside your loop should never execute. Hopefully it doesn't output that.

Upvotes: 1
