Reputation: 463
I am trying to parse the following input using awk patterns:
Smith, Jim 12.34
12.34 Jim Smith
I have one pattern that checks whether the first field contains an alphabetic character, the second field contains an alphabetic character, and the third field contains a digit, and a second pattern that checks for the second form, like so:
$1 ~ /[A-Za-z]/ && $2 ~ /[A-Za-z]/ && $3 ~ /[0-9]/{
do fun things with record
}
$3 ~ /[A-Za-z]/ && $2 ~ /[A-Za-z]/ && $1 ~ /[0-9]/
{
this is the second form of the record
}
However, my program appears to pass both checks and execute both actions. I have been trying to figure out where I am going wrong, but the same thing keeps happening. Any pointers in the right direction are much appreciated. I know there are many ways to do this, a few of which I have found, but I would like to know specifically what I am doing wrong here.
I'm running CentOS 7 with awk:
gawk --version
GNU Awk 4.0.2
Upvotes: 3
Views: 889
Reputation: 656
The problem is the newline before the opening brace after the second pattern. This will work as expected:
$1 ~ /[A-Za-z]/ && $2 ~ /[A-Za-z]/ && $3 ~ /[0-9]/{
print "do fun things with record"
}
$3 ~ /[A-Za-z]/ && $2 ~ /[A-Za-z]/ && $1 ~ /[0-9]/{ # NO newline here
print "this is the second form of the record"
}
Explanation: An AWK program consists of a sequence of pattern { action } pairs, where either the pattern or the action can be omitted. Putting a newline between a pattern and its action makes awk parse them as a pattern with no action (which defaults to printing the record), followed by an action with no pattern (i.e., an action that is executed for every record).
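The effect of the newline can be reproduced with a small shell sketch. It feeds the two sample lines from the question to awk twice, once with the brace on the next line and once with the brace on the same line. The pattern is cut down to a single test on $1 for brevity, and the variable names split_out and joined_out are only illustrative.

```shell
# Two-line sample input from the question.
input='Smith, Jim 12.34
12.34 Jim Smith'

# Brace on the next line: awk sees a pattern with no action (default: print $0),
# followed by a pattern-less action that runs for every record.
split_out=$(printf '%s\n' "$input" | awk '
$1 ~ /[0-9]/
{ print "second form: " $0 }')

# Brace on the same line: the action runs only when the pattern matches.
joined_out=$(printf '%s\n' "$input" | awk '
$1 ~ /[0-9]/ { print "second form: " $0 }')

printf 'brace on next line:\n%s\n\nbrace on same line:\n%s\n' "$split_out" "$joined_out"
```

With the brace on the next line, the "second form" message is printed for both records and the matching record is additionally echoed by the default action. With the brace on the same line, only the record starting with a number produces output.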
Bottom line: stick to Egyptian brackets in AWK, with the opening brace on the same line as the pattern.
Upvotes: 5
Reputation: 67497
If a field contains both letters and digits, it will pass both tests. For example:
$ echo "James007" | awk '/[a-zA-Z]/{print "alpha"} /[0-9]/{print "number"}'
This prints both "alpha" and "number". To accept only fields made up entirely of letters or entirely of digits, anchor the patterns:
$ echo "James 007" | awk '$1~/^[a-zA-Z]+$/{print "alpha"} $2~/^[0-9]+$/{print "number"}'
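Applied to the two record forms from the question, the same anchoring idea might look like the sketch below. It rests on two assumptions that the thread does not confirm: that the last name may carry a trailing comma, and that the amount is a run of digits with an optional decimal part. The variable name result is only illustrative.

```shell
# Two-line sample input from the question.
input='Smith, Jim 12.34
12.34 Jim Smith'

result=$(printf '%s\n' "$input" | awk '
# Form 1: "Last, First amount". Assumption: an optional comma follows the last name.
$1 ~ /^[A-Za-z]+,?$/ && $2 ~ /^[A-Za-z]+$/ && $3 ~ /^[0-9]+(\.[0-9]+)?$/ {
    print "form 1: " $0
}
# Form 2: "amount First Last". Assumption: the amount is digits with an optional decimal part.
$1 ~ /^[0-9]+(\.[0-9]+)?$/ && $2 ~ /^[A-Za-z]+$/ && $3 ~ /^[A-Za-z]+$/ {
    print "form 2: " $0
}')

printf '%s\n' "$result"
```

Because every field is matched from start to end, a mixed field such as James007 satisfies neither rule, and each sample line triggers exactly one action.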
Upvotes: 1