Dip

Reputation: 877

awk built in variable NF

I am a beginner with AWK. I just learned about built-in variables and the for loop. From tutorials I learned that NF means the number of fields in a record. I used it to count the number of fields in each record of a database, as below.

Database name : student

Annie      101   56   89
Joy        102   78   56
rinken     103   45

I ran the following code on it:

awk '{print NR,"->",NF}' student

It gives the following output:

1 -> 4
2 -> 4
3 -> 3

So it clearly counts fields. But with the database below it seems to work differently.

Database name : practice

1
2
3
4
5

I ran the following command on it to sum all the data:

awk '{for(i=1;i<=NF;i++) total=total+$i}; END {print total} ' practice

output :

15

So how does NF work in the second example? How is the output calculated when there is only one field per record, and how many times does the loop run?

Upvotes: 1

Views: 12188

Answers (4)

Kaz

Reputation: 58617

The loop works even with one field because it is an (almost) correct piece of logic that handles every case: no fields, one field, or two or more fields.

The loop variable i is initialized to 1. The loop guard expression is i <= NF.

Thus, firstly, if there are no fields (NF == 0), the loop will not execute, and nothing will be added to total.

If there is one field, then the expression i<=NF is equivalent to i<=1, which is true, so the body is executed. Inside the loop body, the value of the expression $i is added to total. $ is Awk's special operator for indexing over fields. Since i is 1, $i is equivalent to $1: a reference to field 1, so field 1 is added to the total. Then i is incremented by the i++, becoming 2. The i<=NF loop guard fails and so the loop doesn't execute any more.
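The walk-through above can be checked with a minimal sketch, assuming a POSIX shell and awk. The three-line input (one field, an empty line, three fields) and the counter n are made up for illustration; they are not part of the original question.

```shell
# Hypothetical input: a 1-field line, an empty line (NF == 0), a 3-field line.
counts=$(printf '7\n\n1 2 3\n' | awk '{
    n = 0                               # iterations of the loop for this record
    for (i = 1; i <= NF; i++) n++       # same loop header as in the question
    print NR, "->", NF, "fields,", n, "iterations"
}')
echo "$counts"
# 1 -> 1 fields, 1 iterations
# 2 -> 0 fields, 0 iterations
# 3 -> 3 fields, 3 iterations
```

The loop body runs exactly NF times per record, so on the practice file it runs once per line.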

Loops can execute zero times, or just once.

One issue with the code is that if there are no fields anywhere in the file at all, the program will not produce 0. The total variable remains undefined and when referenced in the print in the END block, it produces a blank.
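A minimal sketch of that edge case, assuming a POSIX shell and awk: the input is a single empty line, so no record has any fields. Adding 0 to total in the END block is one common way to force a numeric result; initializing total in a BEGIN block would work as well.

```shell
# Input with no fields at all: one empty line.
blank=$(printf '\n' | awk '{for(i=1;i<=NF;i++) total=total+$i} END {print total}')
zero=$(printf '\n' | awk '{for(i=1;i<=NF;i++) total=total+$i} END {print total+0}')
echo "without +0: [$blank]"   # total was never assigned, so it prints as an empty string
echo "with +0:    [$zero]"    # total+0 forces numeric context, so it prints 0
```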

Upvotes: 2

Mustafa DOGRU

Reputation: 4112

awk reads the file one line (record) at a time, so total accumulates across records:

user@host:/tmp/test$ awk '{for(i=1;i<=NF;i++) total=total+$i}; {printf "NumberOfRow : %s \tNumberOfField : %s \ttotal : %s\n"  , NR, NF, total } END {print "total sum : " total} '  practice 
NumberOfRow : 1     NumberOfField : 1   total : 1
NumberOfRow : 2     NumberOfField : 1   total : 3
NumberOfRow : 3     NumberOfField : 1   total : 6
NumberOfRow : 4     NumberOfField : 1   total : 10
NumberOfRow : 5     NumberOfField : 1   total : 15
total sum : 15

Upvotes: 1

Inian

Reputation: 85800

awk processes the input one line at a time. For each line it reads, the loop runs from 1 to the number of fields in that line, which is 1 for every row here.

So in effect it is evaluated like this:

awk '{for(i=1;i<=NF;i++) total=total+$i}; END {print total} ' practice
#  value of  i=1,NF=1    total=0+1  (1)         NR=1
#  value of  i=1,NF=1    total=1+2  (3)         NR=2
#  value of  i=1,NF=1    total=3+3  (6)         NR=3
#  value of  i=1,NF=1    total=6+4  (10)        NR=4
#  value of  i=1,NF=1    total=10+5 (15)        NR=5

And to sum up all the column 1 entries together, you can just do

awk '{ sum += $1 } END { print sum }' practice
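The two forms agree on practice only because every line there has exactly one field. A minimal sketch of the difference, assuming a POSIX shell and awk, with made-up multi-field input that is not from the question:

```shell
# Hypothetical data with a varying number of fields per line.
data='1 2 3
4 5
6'
# Loop over every field of every record: 1+2+3+4+5+6
all_fields=$(printf '%s\n' "$data" | awk '{for(i=1;i<=NF;i++) total+=$i} END {print total}')
# Sum only the first field of every record: 1+4+6
first_only=$(printf '%s\n' "$data" | awk '{ sum += $1 } END { print sum }')
echo "all fields:   $all_fields"   # 21
echo "first column: $first_only"   # 11
```

Use the NF loop when every field should be added, and $1 when only the first column matters.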

Upvotes: 5

James Brown

Reputation: 37424

The for loop is unnecessary in that script: each record has only one field, so NF is always 1 and has no real effect on the output. You could write the script like this:

awk '{total=total+$1} END {print total}' practice
15

Upvotes: 1
