Reputation: 5438
I just struggled a long time to come up with a working little perl one-liner like this:
perl -pe 'if (/^(".*",").*, /) { $a = $1; s/, /"\n$a/g}'
My input data looks something like this:
"foo","bar a"
"baz","bar a, bar b, bar c"
And I'm transforming it to this:
"foo","bar a"
"baz","bar a"
"baz","bar b"
"baz","bar c"
Basically, I wanted to match only certain lines (if (/, /)...) and, on those lines, replace every instance of that match with part of the original line. A single s///g with a capture group would not work because it would not recurse properly: the replacement string has to be figured out before the replacements start happening. I expected this to work:
if (/^(".*",").*, /) { s/, /"\n$1/g}
Yet it did not. The variable $1 was never anything but empty. Given what the Perl docs I read say about persistence, this was a surprise to me:
These match variables generally stay around until the next successful pattern match.
Only when I started stashing the result in a variable of my own could I access the result from the substitution expression:
if (/^(".*",").*, /) { $a = $1; s/, /"\n$a/g}
Why was $1 being cleared when not only was there no successful match, there was no request for a match at all in my search and replace? And would there have been a better way to approach this problem?
Upvotes: 1
Views: 139
Reputation: 58651
You ask:
Why was $1 being cleared when not only was there no successful match, there was no request for a match at all in my search and replace?
Are you perhaps conflating matching and capturing?
For s/PATTERN/REPLACEMENT/ to do anything, the PATTERN must match. So if there is any substitution at all as a result of an s/// operation, you know that its PATTERN regex-matched successfully. REPLACEMENT is then evaluated.
(In your case, the s/, /.../ PATTERN matches at least once on the comma and space after the text bar a in your second input line.)
Of course, when that happens, the interpreter resets all the capture variables ($1, $2, etc.) to whatever PATTERN captured. Again, this is before REPLACEMENT is evaluated. Since your PATTERN doesn't capture anything, those variables are undefined, just as they would be if you had explicitly done a non-capturing m/, / match.
Upvotes: 3
Reputation: 386361
The values of match variables do indeed stay around until the next successful pattern match (or until the scope in which the match occurred is exited).
In your case, they changed because there was a successful pattern match: you successfully matched against the pattern /, /. The capture variables therefore reflect the text captured by that match. $1 returns the text matched by the non-existent first capture, so it returned undef.
$ perl -e'
$_ = "a";
s/(a)/a/; CORE::say $1 // "[undef]"; # Successful match
s/(c)/c/; CORE::say $1 // "[undef]"; # Unsuccessful match
s/a/a/; CORE::say $1 // "[undef]"; # Successful match
'
a
a
[undef]
Upvotes: 4