Reputation: 3162
I have several regexps with capture groups, and evidently the capture variable retains the value of the last successful capture:
# Two scalars to use for regexp
$x = 'abc';
$y = 'def';
# first regexp
$x =~ /^(ab)/;
$x = $1;
# second regexp
$y =~ /^(de)/;
$y = $1;
print "$x\n$y";
The output is:
ab
de
Here is the one-liner version:
perl -e "$x='abc'; $y='def'; $x =~ /^(ab)/; $x=$1; $y =~ /^(de)/; $y=$1; print \"$x\n$y\";"
If $y='def' is changed to $y='zdef':
perl -e "$x='abc'; $y='zdef'; $x =~ /^(ab)/; $x=$1; $y =~ /^(de)/; $y=$1; print \"$x\n$y\";"
the output is:
ab
ab
If I try to set $1=undef after $x=$1 to clear the current value in $1:
perl -e "$x='abc'; $y='zdef'; $x =~ /^(ab)/; $x=$1; $1=undef; $y =~ /^(de)/; $y=$1; print \"$x\n$y\";"
the output is:
Modification of a read-only value attempted at -e line 1.
Evidently, capture variables are read-only.
I'm wondering how I can cope with this problem. The result I would like to have is:
ab
..
where .. means "empty", as in this case where the first regexp fails to match ($x='zabc'):
perl -e "$x='zabc'; $y='def'; $x =~ /^(ab)/; $x=$1; $y =~ /^(de)/; $y=$1; print \"$x\n$y\";"
..
de
Upvotes: 1
Views: 183
Reputation: 386361
Replace
$y =~ /^(de)/;
$y = $1;
with
( $y ) = $y =~ /^(de)/;
or
$y = $y =~ /^(de)/ ? $1 : undef;
The former relies on the fact that the match operator returns the sequences it captured when called in list context.
The latter relies on the fact that the match operator returns whether the match was successful or not when called in scalar context.
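As a minimal sketch of the list-context form applied to the question's two strings: the helper name first_capture is hypothetical and chosen only for illustration. It relies on the match returning an empty list on failure, so the assigned variable ends up undef.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first capture, or undef when the match fails.
sub first_capture {
    my ($string, $regex) = @_;
    my ($capture) = $string =~ $regex;   # list context: empty list on failure
    return $capture;
}

my $x = first_capture('abc',  qr/^(ab)/);
my $y = first_capture('zdef', qr/^(de)/);

# Show ".." for an undefined result, mirroring the question's notation.
for my $value ($x, $y) {
    my $shown = defined $value ? $value : '..';
    print "$shown\n";
}
```

With these inputs the script should print ab followed by .., which is the output the question asks for.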
Upvotes: 1
Reputation: 67900
You need to use the capture variables $1 (and $2, $3, etc.) carefully. They are only assigned by a successful pattern match; a failed match leaves them holding the values from the last successful one, so you have to make sure you are reading the right match.
man perlvar states (the emphasis is on successful):
$<digits> ($1, $2, ...)
Contains the subpattern from the corresponding set of capturing
parentheses from the last successful pattern match, ...
Typically, you would do this:
if ('abc' =~ /^(ab)/) {
    $x = $1;
}
if ('zdef' =~ /^(de)/) {
    $y = $1;
}
This way, you never get the wrong value assigned.
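One caveat when this guarded form is applied to the question's code: there $y is both the input and the result, so a failed match would leave the original string in $y rather than an empty value. A minimal sketch of one way to handle that, assuming an explicit else branch is acceptable (the helper name leading_de is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: the guarded form plus an explicit reset on failure.
sub leading_de {
    my ($y) = @_;
    if ($y =~ /^(de)/) {
        $y = $1;        # safe: $1 belongs to this successful match
    }
    else {
        $y = undef;     # without this branch $y would keep its original string
    }
    return $y;
}

for my $input ('def', 'zdef') {
    my $result = leading_de($input);
    my $shown  = defined $result ? $result : '(undef)';
    print "$input -> $shown\n";
}
```
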
There are, however, other ways to do this. The pattern match itself gives a return value, which depends on the context.
$n = 'abc' =~ /^(ab)/; # $n = 1 for "true". This is scalar context
($n) = 'abc' =~ /^(ab)/; # $n = 'ab', the captured string. This is list context
$n = () = 'abc' =~ /(.)/g; # $n = 3, for 3 matches. /g gives multiple matches
($f, $g) = 'abc' =~ /(.)/g; # $f = 'a', $g = 'b'. List context
Upvotes: 4
Reputation: 21
Perl regexes report their results through global variables. If a match fails, $1 still holds the group captured by the last successful match.
This is simply how Perl works.
What can you do? First, collect all captured groups into an array:
@captures = $y =~ /^(de)/;
And then work with it.
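A short sketch of what "working with it" might look like when there are several groups: an empty array signals a failed match, so no stale $1, $2, ... can leak in. The date pattern and the name split_date are assumptions made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (year, month, day), or an empty list on failure.
sub split_date {
    my ($string) = @_;
    my @captures = $string =~ /^(\d{4})-(\d{2})-(\d{2})$/;
    return @captures;
}

for my $input ('2024-05-17', 'not a date') {
    my @captures = split_date($input);
    if (@captures) {                       # true only when the match succeeded
        print "$input: @captures\n";
    }
    else {
        print "$input: no match\n";
    }
}
```
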
Second, use the ternary operator:
$y = $y =~ /(ho)/ ? $1 : undef;
Or you can consider this package: https://metacpan.org/pod/Regex::Object. It helps with this sort of thing, but you will need some basic knowledge of CPAN and objects.
Upvotes: 2