Bryan Henry

Reputation: 8408

Perl alternation match behaves differently with parentheses

Can someone explain why my match behaves differently depending on whether the alternation is enclosed in a capture group?

Is this possibly due to the old version of Perl (which I have no control over, sadly), or am I misunderstanding something? My understanding was that the parentheses were a convention for some people but otherwise unnecessary in this case.

[~]$ perl -v

This is perl, v5.6.1 built for PA-RISC1.1-thread-multi
(with 1 registered patch, see perl -V for more detail)

Copyright 1987-2001, Larry Wall

Binary build 633 provided by ActiveState Corp. http://www.ActiveState.com
Built 12:17:09 Jun 24 2002


Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on
this system using `man perl' or `perldoc perl'.  If you have access to the
Internet, point your browser at http://www.perl.com/, the Perl Home Page.

[~]$ perl -e 'print "match\n" if ("getnew" =~ /^get|put|remove$/);'
match
[~]$ perl -e 'print "match\n" if ("getnew" =~ /^(get|put|remove)$/);'
[~]$

Upvotes: 1

Views: 93

Answers (2)

Miller

Reputation: 35198

By design, an "or" | is confined to its group when it is enclosed in parentheses.

The second regex has parentheses around the three words, so the ^ and $ anchors distribute over every alternative, which makes it equivalent to the following:

if ("getnew" =~ /^get$/ || "getnew" =~ /^put$/ || "getnew" =~ /^remove$/) {
     print "match\n" ;
}

However, the first regex has no parentheses, so the "or" splits the entire expression, anchors included: ^ belongs only to the first alternative and $ only to the last. It matches because the first test, /^get/, succeeds:

if ("getnew" =~ /^get/ || "getnew" =~ /put/ || "getnew" =~ /remove$/) {
     print "match\n" ;
}
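To see the difference on more than one input, here is a minimal sketch that runs both patterns against a few sample strings. The helper name check and the sample words other than "getnew" are my own choices for illustration, not something from the question. It avoids anything newer than what 5.6 provides, as far as I know.

```perl
use strict;
use warnings;

# Without parentheses, ^ binds only to "get" and $ only to "remove".
my $ungrouped = qr/^get|put|remove$/;
# With parentheses, both anchors apply to every alternative.
my $grouped   = qr/^(get|put|remove)$/;

# Hypothetical helper: returns 1 if $str matches $re, otherwise 0.
sub check {
    my ($re, $str) = @_;
    return $str =~ $re ? 1 : 0;
}

for my $str (qw(getnew input remove get)) {
    printf "%-8s ungrouped=%d grouped=%d\n",
        $str, check($ungrouped, $str), check($grouped, $str);
}
```

"getnew" and "input" match only the ungrouped pattern (the first starts with get, the second contains put anywhere), while "remove" and "get" match both.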

Upvotes: 2

taggon

Reputation: 1936

^get|put|remove$ finds ^get or put or remove$. So, "getnew" matches the pattern because it starts with get.

Whereas ^(get|put|remove)$ finds ^get$ or ^put$ or ^remove$.
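If you only need the grouping and not the captured text, a non-capturing group (?:...) should give the same anchoring behaviour without setting $1. A small sketch, assuming you just want a yes/no test; the sub name is_command is made up for the example:

```perl
use strict;
use warnings;

# Hypothetical helper: true only when the whole string is one of the words.
sub is_command {
    my ($str) = @_;
    # (?:...) groups the alternation so ^ and $ apply to each alternative,
    # but it does not capture anything.
    return $str =~ /^(?:get|put|remove)$/ ? 1 : 0;
}

print is_command("get"), is_command("getnew"), "\n";
```

So the parentheses are not just a convention here: some form of grouping is what makes the anchors apply to all three words.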

Upvotes: 6
