manus

Reputation: 41

Backslash inside and outside of regex, $1 vs \1

I am seeing some strange behavior with grouping in Perl.

Below is a file snippet I have:

nmos MNANT2(sam_1_,sam_1_,sam_1_);  
nmos MNANT1(sam[0],sam[0],sam[0]);  
nmos MNANT3(ovstb,ovstb,ovstb);
nmos M3(net14, VSS, in);  

I am trying to match the lines where all three fields inside the parentheses are the same.

I tried the following one-liners:

perl -nle 'm/(.+?\((.+?),$2,$2\).+)/ && print $1' new

That doesn't work, but this one works fine:

perl -nle 'm/(.+?\((.+?),\2,\2\).+)/ && print $1' new

So my question is: why doesn't $2 work here when \2 does? Shouldn't we use "$" for backreferences, as I did with $1 at the end?

And if "\" works everywhere, I tried using \1 instead of $1 as well:

perl -nle 'm/(.+?\((.+?),\2,\2\).+)/ && print \1' new

Instead of the matched lines, it prints:

SCALAR(0x1a49678)
SCALAR(0x1a49678)
SCALAR(0x1a49678)

What am I fundamentally missing here? I look forward to hearing from the experts.

Upvotes: 1

Views: 136

Answers (2)

ikegami

Reputation: 385655

You seem to think the regex patterns and Perl code are the same language. a+b in a regex pattern isn't addition, and \2 outside a regex isn't an instruction to match the second capture.


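perl -nle 'm/(.+?\((.+?),$2,$2\).+)/ && print $1' new doesn't work because $2 is interpolated into the pattern before the pattern is even compiled. At that point $2 holds whatever an earlier successful match left in it (typically nothing), not the text the second group is about to capture.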

perl -nle 'm/(.+?\((.+?),\2,\2\).+)/ && print $1' new works because the regex atom \2 means "match what the second capture captured."

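perl -nle 'm/(.+?\((.+?),\2,\2\).+)/ && print \1' new doesn't work because, outside a pattern, \ is Perl's reference-taking operator. \1 is a reference to the constant 1, and printing a reference gives output like SCALAR(0x1a49678).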

Upvotes: 2

Jonathan Mee

Reputation: 38919

The m// match and the print are separate expressions joined by &&; only the part between the slashes is a regex.

Within a regex, \2 is a backreference to the second capture, which is assigned to the $2 variable once the match has succeeded. Outside the regex, \2 is not a backreference at all (Perl parses it as a reference to the constant 2); only the variable $2 gives you the captured text. See here for more info: http://perldoc.perl.org/perlretut.html#Backreferences

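When reading that link, note that since Perl 5.10 \2 is still recognized but \g2 (or \g{2}) is encouraged. This is because a form like \11 is ambiguous: it can mean the eleventh capture group or the octal escape for a tab, depending on how many groups the pattern has.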

Upvotes: 1
