Regular Expressions in AWK

Question

I am trying to parse the following input using awk patterns:

Smith, Jim 12.34

12.34 Jim Smith

My first pattern checks that the first field contains a letter, the second field contains a letter, and the third field contains a digit. A second pattern checks for the second layout, like so:

$1 ~ /[A-Za-z]/ && $2 ~ /[A-Za-z]/ && $3 ~ /[0-9]/{
do fun things with record
}
$3 ~ /[A-Za-z]/ && $2 ~ /[A-Za-z]/ && $1 ~ /[0-9]/
{
this is the second form of the record
}

However, my program appears to pass both checks and execute both actions. I have been trying to figure out where I went wrong, but the same thing keeps happening. Any pointers in the right direction are much appreciated. I know there are many ways to do this, a few of which I have found, but I would like to know specifically what I am doing wrong here.

I'm running CentOS 7 with awk:

gawk --version
GNU Awk 4.0.2

matz · Accepted Answer

The problem is the newline before the opening brace after the second pattern. This will work as expected:

$1 ~ /[A-Za-z]/ && $2 ~ /[A-Za-z]/ && $3 ~ /[0-9]/{
 print "do fun things with record"
}
$3 ~ /[A-Za-z]/ && $2 ~ /[A-Za-z]/ && $1 ~ /[0-9]/{ # NO newline here
 print "this is the second form of the record"
}

Explanation: An AWK program consists of a sequence of pattern { action } pairs, where either the pattern or the action can be omitted. A newline between the pattern and the opening brace makes awk parse them as two separate rules: a pattern with no action (which by default prints the matching record), followed by an action with no pattern (i.e., an action that is executed for every record).
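
To see the difference concretely, here is a minimal sketch that runs both variants against the two sample lines from the question. The variable names (input, broken, fixed) and the "first form:" / "second form:" labels are made up for illustration; any POSIX shell and awk should behave the same way.

```shell
# Sample input from the question: one line per record layout.
input='Smith, Jim 12.34
12.34 Jim Smith'

# Broken: the newline after the second pattern splits it into a
# pattern with no action (default action: print the record) and a
# bare { ... } block that runs for every record.
broken=$(printf '%s\n' "$input" | awk '
$1 ~ /[A-Za-z]/ && $2 ~ /[A-Za-z]/ && $3 ~ /[0-9]/ {
    print "first form: " $0
}
$3 ~ /[A-Za-z]/ && $2 ~ /[A-Za-z]/ && $1 ~ /[0-9]/
{
    print "second form: " $0
}')

# Fixed: the opening brace sits on the same line as its pattern,
# so the action only runs when the pattern matches.
fixed=$(printf '%s\n' "$input" | awk '
$1 ~ /[A-Za-z]/ && $2 ~ /[A-Za-z]/ && $3 ~ /[0-9]/ {
    print "first form: " $0
}
$3 ~ /[A-Za-z]/ && $2 ~ /[A-Za-z]/ && $1 ~ /[0-9]/ {
    print "second form: " $0
}')

printf 'broken:\n%s\n\nfixed:\n%s\n' "$broken" "$fixed"
```

The broken variant prints four lines: "second form:" appears for both records, and the second record is also echoed unchanged by the action-less pattern. The fixed variant prints exactly one labelled line per record.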

Bottom line: stick to Egyptian brackets (opening brace on the same line as the pattern) in AWK.
