Homap

Reputation: 2214

awk variable used before defined to determine the longest line

Given the following file:

>tna
ATGC
>ggf
TG
>gta
TGGTT

I want to find the length of the longest line that does not start with >. I came up with this AWK code:

awk '!/>/ {len=length($0)} len > maximum {maximum=len} END{print maximum}' file

I cannot explain, however, how maximum is used before being defined. First, I make the comparison to len and then I set maximum to len. How does AWK know what maximum is?

Thanks!

Upvotes: 2

Views: 75

Answers (3)

paxdiablo

Reputation: 881113

An uninitialised variable in AWK holds the empty string "", and the empty string is treated as zero in a numeric context. So the first time len > maximum is evaluated, maximum counts as 0.
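A minimal sketch of that behaviour, assuming any POSIX-compatible awk is on the PATH. The variable x is never assigned; the shell variable out is only there to capture the result:

```shell
# x is never assigned anywhere in this awk program.
out=$(awk 'BEGIN {
    if (x == 0)  print "numeric: x equals 0"
    if (x == "") print "string: x equals the empty string"
    print "length(x) = " length(x)
    print "x + 5 = " (x + 5)
}')
printf '%s\n' "$out"
```

Both comparisons succeed, length(x) is 0 and x + 5 is 5, so the same unset variable acts as "" or as 0 depending on how it is used.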

Note also that />/ matches every line containing >, not just lines starting with it. You would be better off anchoring the pattern with ^, something like (readable):

!/^>/ {
    len = length($0)
    if (len > max) {
        max = len
    }
}
END {
    print max
}

or, in its minimalist form:

!/^>/{len=length($0);if(len>max){max=len}}END{print max}
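To see the difference between the two patterns, here is a small sketch on made-up input (not the file from the question) in which one sequence line happens to contain > in the middle. The names input, loose and anchored are only for illustration:

```shell
# Hypothetical input: the sequence line of the second record contains ">" mid-line.
input=$(printf '>id1\nACG\n>id2\nAC>GTTT\n')

# Unanchored: skips every line that contains ">" anywhere.
loose=$(printf '%s\n' "$input" | awk '!/>/  { if (length($0) > max) max = length($0) } END { print max }')

# Anchored: skips only lines that start with ">".
anchored=$(printf '%s\n' "$input" | awk '!/^>/ { if (length($0) > max) max = length($0) } END { print max }')

echo "unanchored />/ : $loose"
echo "anchored /^>/  : $anchored"
```

The unanchored version reports 3 because it throws away AC>GTTT along with the headers, while the anchored version reports 7.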

Upvotes: 5

Inian

Reputation: 85530

Awk is a dynamically typed language: a variable starts out "untyped" and is treated as a string or a number depending on the context it is used in. By default, variables are initialized to the empty string, which evaluates to zero when used in a numeric context. So on the first comparison you are essentially comparing len against a maximum of zero.

See sections 6.1.3.1 "Using Variables in a Program" and 6.3.2.1 "String Type versus Numeric Type" of the GNU Awk User's Guide.
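As a rough illustration, the following sketch traces max on the sample data from the question, printing both its string value and its numeric value before each comparison. The variable name trace and the output format are only for this example:

```shell
trace=$(printf '>tna\nATGC\n>ggf\nTG\n>gta\nTGGTT\n' | awk '
!/^>/ {
    len = length($0)
    # [%s] shows the string value of max; max + 0 forces its numeric value.
    printf "len=%d max=[%s] max+0=%d\n", len, max, max + 0
    if (len > max) max = len
}
END { print "result=" max }')
printf '%s\n' "$trace"
```

The first line of output is len=4 max=[] max+0=0: max is still empty, yet it compares as 0, so 4 > 0 holds and max becomes 4. The final line is result=5.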

Upvotes: 3

RavinderSingh13

Reputation: 133428

Could you please try the following, written and tested with the shown samples in GNU awk.

awk '!/^>/{len=length($0);max=(len>max?len:max)} END{print max}' Input_file

Explanation: a detailed, line-by-line explanation of the above.

awk '                     ##Start of the awk program.
!/^>/{                    ##If the line does not start with >, do the following.
  len=length($0)          ##Store the length of the current line in len.
  max=(len>max?len:max)   ##Set max to whichever is greater: len or the previous max.
}
END{                      ##Start of the END block, run after all input is read.
  print max               ##Print the final max value.
}
' Input_file              ##Name of the input file.

Upvotes: 2
