Thierry Huysman

Reputation: 43

Variable evaluation before assignment in awk

In the following awk statement:

awk '$2 > maxrate {maxrate = $2; maxemp = $1} 
     END {print "highest hourly rate:", maxrate, "for", maxemp}' pay.data

run on the following data:

Beth 4.00 0
Dan 3.75 0
Kathy 4.00 10
Mark 5.00 20
Mary 5.50 22
Susie 4.25 18

How does $2 > maxrate work, since maxrate is evaluated before $2 is ever assigned to it?

Upvotes: 4

Views: 169

Answers (2)

thanasisp

Reputation: 5975

From the GNU awk manual:

By default, variables are initialized to the empty string, which is zero if converted to a number. There is no need to explicitly initialize a variable in awk, which is what you would do in C and in most other traditional languages.
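The dual nature of an unset variable described in the quote can be checked directly. This is a minimal sketch; the variable name x is only illustrative and is deliberately never assigned.

```shell
# x is never assigned, so awk treats it as "" in string context
# and as 0 in numeric context.
result=$(awk 'BEGIN { print "[" x "]", x + 0, (x == 0), (x == "") }')
echo "$result"
# prints: [] 0 1 1
```

Both comparisons are true, which is why $2 > maxrate can be evaluated before maxrate has ever been assigned.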

This implicit initialization, which is common in scripting languages, is very convenient but also leaves room for mistakes and confusion.


For example, in this case you can calculate the maximum with no need to initialise max:

awk '$2 > max{max = $2} END{print "max:", max}' file
max: 5.50

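But if you do the same for the minimum, you get the empty string as the result. min is uninitialized, so it compares as zero; no value in the data is less than zero, so min is never assigned and is still the empty string when it is printed.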

awk '$2 < min{min = $2} END{print "min:", min}' file
min: 

The max calculation would fail in the same way if all the values were negative, so it is safer to assign a value unconditionally on the first record:

awk 'NR==1{min=$2; next} $2<min{min = $2} END{print "min:", min}' file
min: 3.75

This approach works for both min and max, for numbers of any range. In general, when scripting, we have to consider every path by which an undefined or uninitialised variable gets its first value, and every case in which it may be tested before it has one.
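To see the difference on input where the implicit zero gets in the way, here is a small sketch comparing the two approaches. The three all-negative records and the helper name sample are made up for illustration; they are not part of the question's data.

```shell
# Hypothetical all-negative data, emitted by a helper instead of a file.
sample() { printf '%s\n' 'a -3.5' 'b -1.25' 'c -2'; }

# Naive version: max starts out as 0, no value exceeds it, so max stays empty.
naive=$(sample | awk '$2 > max { max = $2 } END { print "max:", max }')

# Seeded version: the first record initialises both min and max.
seeded=$(sample | awk '
    NR == 1  { min = max = $2; next }
    $2 < min { min = $2 }
    $2 > max { max = $2 }
    END      { print "min:", min, "max:", max }')

echo "naive:  [$naive]"
echo "seeded: [$seeded]"
# prints:
# naive:  [max: ]
# seeded: [min: -3.5 max: -1.25]
```

Seeding from the first record means the result never depends on how the data happens to compare with zero.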

Upvotes: 5

RavinderSingh13

Reputation: 133508

By default, if you don't assign a value to a variable in awk, its value is null (awk lets you use a variable without declaring it first). A null value is the empty string, which counts as zero in a numeric comparison. So on the first line the condition $2 > maxrate compares 4.00 with zero, evaluates to true, and the block runs, assigning the 2nd field to maxrate and the 1st field to maxemp.

From then on, maxrate holds the highest 2nd-field value seen so far. On each following line the condition compares the current line's 2nd field with that value and updates maxrate and maxemp whenever the current one is larger, until every line of the input file has been read. Finally, the END block prints the result.
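The line-by-line behaviour can be made visible with a tracing variant of the question's command. This is only a sketch: the extra printf rule and the variable verdict are additions for illustration, and the data is piped in rather than read from pay.data.

```shell
# Data copied from the question, piped in instead of read from pay.data.
trace=$(printf '%s\n' 'Beth 4.00 0' 'Dan 3.75 0' 'Kathy 4.00 10' \
                      'Mark 5.00 20' 'Mary 5.50 22' 'Susie 4.25 18' |
awk '
    # Show the comparison before the original rule gets to update maxrate.
    { verdict = ($2 > maxrate) ? "true" : "false"
      printf "%s: %s > [%s] is %s\n", $1, $2, maxrate, verdict }
    $2 > maxrate { maxrate = $2; maxemp = $1 }
    END { print "highest hourly rate:", maxrate, "for", maxemp }')
echo "$trace"
# prints:
# Beth: 4.00 > [] is true
# Dan: 3.75 > [4.00] is false
# Kathy: 4.00 > [4.00] is false
# Mark: 5.00 > [4.00] is true
# Mary: 5.50 > [5.00] is true
# Susie: 4.25 > [5.50] is false
# highest hourly rate: 5.50 for Mary
```

The empty brackets on the first line show maxrate before it has received any value.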

Upvotes: 4
