Reputation: 877
I am a beginner with AWK. I have just learned about built-in variables and the for loop. From tutorials I learned that NF means Number of Fields in a record. I used it to count the number of fields in each record of the database below.
Database name : student
Annie 101 56 89
Joy 102 78 56
rinken 103 45
and ran the following code on it:
awk '{print NR,"->",NF}' student
which gives the output below:
1 -> 4
2 -> 4
3 -> 3
So it is clear that it works on fields. But on the database below it seems to work differently.
Database name : practice
1
2
3
4
5
I ran the following command on it to sum up all the data:
awk '{for(i=1;i<=NF;i++) total=total+$i}; END {print total} ' practice
output :
15
So how does NF work in the second example? My question is how the output is calculated, given that there is only one field in each record. How many times does the loop run?
Upvotes: 1
Views: 12188
Reputation: 58617
The loop works even when there is one field because it is an (almost) correct piece of logic that handles every case: no fields, one field, or two or more fields.
The loop variable i is initialized to 1. The loop guard expression is i <= NF.
Thus, firstly, if there are no fields (NF == 0), the loop will not execute, and nothing will be added to total.
If there is one field, then the expression i<=NF is equivalent to i<=1, which is true, so the body is executed. Inside the loop body, the value of the expression $i is added to total. $ is Awk's special operator for indexing over fields. Since i is 1, $i is equivalent to $1: a reference to field 1, so field 1 is added to the total. Then i is incremented by the i++, becoming 2. The i<=NF loop guard fails and so the loop doesn't execute any more.
Loops can execute zero times, or just once.
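As a quick check of the cases described above, here is a minimal sketch that feeds awk an empty line, a one-field line, and a three-field line, and counts how many times the loop body runs for each record. The input lines and the counter n are made up for illustration; they are not part of the original script.

```shell
# Hypothetical input: an empty line, a one-field line, a three-field line.
# n is an illustrative counter, reset for each record, that counts loop iterations.
counts=$(printf '\n7\n1 2 3\n' |
  awk '{ n = 0; for (i = 1; i <= NF; i++) n++; print NF, n }')
echo "$counts"
# prints (NF, then iterations):
# 0 0
# 1 1
# 3 3
```

Each output line shows NF followed by the number of iterations, so the loop runs exactly NF times per record, including zero times for the empty line.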
One issue with the code is that if there are no fields anywhere in the file at all, the program will not produce 0. The total variable remains uninitialized, and when referenced in the print in the END block, it produces a blank.
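One common way to get 0 in that situation is to force a numeric context by printing total + 0. The sketch below compares the two forms on empty input; the shell variables blank and zero are only there to make the difference visible and are not part of the original command.

```shell
# Empty input: total is never assigned, so print emits an empty line.
blank=$(printf '' | awk '{ for (i = 1; i <= NF; i++) total = total + $i } END { print total }')

# Adding 0 makes awk treat the uninitialized variable as the number 0.
zero=$(printf '' | awk '{ for (i = 1; i <= NF; i++) total = total + $i } END { print total + 0 }')

printf 'without +0: [%s]\nwith +0: [%s]\n' "$blank" "$zero"
# prints:
# without +0: []
# with +0: [0]
```

Initializing the variable in a BEGIN block (BEGIN { total = 0 }) would be another way to get the same result.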
Upvotes: 2
Reputation: 4112
awk reads the file sequentially, one record at a time, and runs the loop once per record:
user@host:/tmp/test$ awk '{for(i=1;i<=NF;i++) total=total+$i}; {printf "NumberOfRow : %s \tNumberOfField : %s \ttotal : %s\n" , NR, NF, total } END {print "total sum : " total} ' practice
NumberOfRow : 1 NumberOfField : 1 total : 1
NumberOfRow : 2 NumberOfField : 1 total : 3
NumberOfRow : 3 NumberOfField : 1 total : 6
NumberOfRow : 4 NumberOfField : 1 total : 10
NumberOfRow : 5 NumberOfField : 1 total : 15
total sum : 15
Upvotes: 1
Reputation: 85800
awk runs sequentially on each line, so for each line it parses, it loops up to the number of columns, which is 1 in every row.
So effectively it is processed like this:
awk '{for(i=1;i<=NF;i++) total=total+$i}; END {print total} ' practice
# value of i=1,NF=1 total=0+1 (1) NR=1
# value of i=1,NF=1 total=1+2 (3) NR=2
# value of i=1,NF=1 total=3+3 (6) NR=3
# value of i=1,NF=1 total=6+4 (10) NR=4
# value of i=1,NF=1 total=10+5 (15) NR=5
And to sum up all the column 1 entries together, you can just do
awk '{ sum += $1 } END { print sum }' practice
Upvotes: 5
Reputation: 37424
That for loop is unnecessary in that script, as there is only one field in each record, and the output does not depend on NF in any way. You could write the script like this:
awk '{total=total+$1} END {print total}' practice
15
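The two forms give the same answer only because every record in practice has a single field. As a rough illustration, the sketch below runs both styles on the student data from the question, recreated inline: the loop adds every field (the names count as 0 in a numeric context), while a single-column sum adds only the chosen field. The shell variable names are hypothetical.

```shell
# The student data from the question, recreated inline.
student='Annie 101 56 89
Joy 102 78 56
rinken 103 45'

# Loop over every field: names evaluate to 0, all numeric fields are added.
all_fields=$(printf '%s\n' "$student" |
  awk '{ for (i = 1; i <= NF; i++) total += $i } END { print total + 0 }')

# Single column: only field 2 is added.
col2=$(printf '%s\n' "$student" |
  awk '{ total += $2 } END { print total + 0 }')

echo "all fields: $all_fields, column 2: $col2"
# prints: all fields: 630, column 2: 306
```

So the loop over NF is the right tool when every field should be summed, and the $1 form is enough when only one column matters.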
Upvotes: 1