sdaau

Reputation: 38651

awk - using a regex statement (with slashes) inside action (braces)

Using:

$ awk --version
GNU Awk 3.1.7

I'm somewhat puzzled by the proper use of regex enclosed in forward slashes / inside action braces { } in awk. For instance, this works:

$ echo "4,testing" | awk -F, '/test/ {print $0}'
4,testing

Using if and match, instead of the forward slash regex syntax, also works (provided additional outer action braces are added):

$ echo "4,testing" | awk -F, '{if(match($0, "test")) {print $0}}'
4,testing

So, I guess, /test/ as a pattern should be equivalent to if(match($0, "test")), right?

Anyways, I want to do some testing per field - and then a regex check on the entire string... and match within nested action braces from if works as expected:

$ echo "4,testing" | awk -F, '{if($1==4) {if(match($0, "test")) {print $0}}}'
4,testing

... but then, if I want to replace the if(match(...)) with a forward slash regex, I get:

$ echo "4,testing" | awk -F, '{if($1==4) {/test/ {print $0}}}'
awk: {if($1==4) {/test/ {print $0}}}
awk:                    ^ syntax error

Can anyone explain what the rules are, when to use forward slash regex - and when to use match() regex?


While writing this, I discovered by accident that this works:

$ echo "4,testing" | awk -F, '{if($1==4) {if(/test/) {print $0}}}'
4,testing

... so it seems: within an action, the forward slash regex must be in an argument of an if statement... But that still doesn't make sense to me - given that in the very first example above, the regex is not (at least, not in a manner obvious to me) located in an if() argument?

Upvotes: 1

Views: 1510

Answers (2)

Chris Seymour

Reputation: 85875

A string inside forward slashes, like /test/, is a regex constant in awk. It is an expression, not a statement, just as a call to match() is an expression and not a statement. The syntax /test/{print $0} is shorthand for {if ($0 ~ /test/) {print $0}}, where ~ is the regexp matching operator. That shorthand only applies when the condition is outside the block, however.

You are equating {if (match($0, "test")){print $0}} with {/test/{print $0}}; however, you still need the if statement when inside a block:

$ echo "4,testing" | awk -F, '{if($1==4) {if (/test/){print $0}}}'
4,testing

The regexp operator ~ is what is commonly used, not the match() function, although match() does have its use cases.
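As a rough sketch of one such use case (reusing the sample input from the question; the variable name result is only for illustration): match() also sets RSTART and RLENGTH, so it is handy when you need to know where the match is, not just whether there is one.

```shell
# match() returns the match position and sets RSTART (start index, 1-based)
# and RLENGTH (length of the match); the ~ operator only yields true/false.
result=$(echo "4,testing" | awk -F, '{ if (match($0, /test/)) print RSTART, RLENGTH }')
echo "$result"   # prints: 3 4
```

Here "test" begins at the third character of the line and is four characters long.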

What you really should be doing is:

$ echo "4,testing" | awk -F, '$1==4&&/test/'
4,testing

We don't need a block, as the default action in awk is {print $0}, and we use the logical AND operator && to test that both conditions are true.
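To make the equivalence concrete, here is a small sketch (the variable names input, a, b and c are only for illustration) that spells the same test out three ways, from the shortest form to the fully explicit one. All three should print the same line.

```shell
input="4,testing"

# 1. Pattern only: the default action {print $0} is implied.
a=$(echo "$input" | awk -F, '$1==4 && /test/')

# 2. Same pattern with the ~ operator and the action written out.
b=$(echo "$input" | awk -F, '$1==4 && $0 ~ /test/ {print $0}')

# 3. No pattern at all: the test moves into an if statement inside the block.
c=$(echo "$input" | awk -F, '{ if ($1==4 && $0 ~ /test/) print $0 }')

echo "$a"; echo "$b"; echo "$c"   # each prints: 4,testing
```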

Upvotes: 5

Barmar

Reputation: 782106

The basic syntax of awk is that it's a sequence of:

<condition> <action>

The <condition> is an expression that's tested on each line; if it's true, the <action> is executed. If the <action> is a { ... } block, it must contain statements. Within a statement, to test a condition you have to use if.

Another way to think of it is that there's an implicit if around the <condition> part of each awk line.

A regexp is just a type of expression that can appear in a condition.
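A quick way to see that a bare regexp is an ordinary expression (a minimal sketch; the variable names found, hit and miss are made up for the demonstration): inside a block, /test/ evaluates to 1 or 0 depending on whether $0 matches, so it can be assigned like any other value.

```shell
# Inside an action, /test/ means ($0 ~ /test/) and yields 1 (match) or 0 (no match).
hit=$(echo "4,testing" | awk '{ found = /test/; print found }')
miss=$(echo "4,other"  | awk '{ found = /test/; print found }')
echo "$hit $miss"   # prints: 1 0
```

That is why if (/test/) works inside a block: if takes an expression, and the regexp supplies one. Following it directly with another { ... } block, as in the failing example, is not a valid statement.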

Upvotes: 1
