Constant_Learner

Reputation: 45

Substitute only in the matched part of a Perl pattern

How can I substitute only within the matched part of a pattern and store the result back in the same variable using Perl?

For example:

my $str = "a.b.AA pat1 BB hgf AA pat1 BB jkl CC pat1 don't change pat1";

I want to match pat1 only where it appears between AA and BB and replace it with PAT2. I don't want to replace pat1 anywhere else in the same string.

Expected output string:

a.b.AA PAT2 BB hgf AA PAT2 BB jkl CC pat1 don't change pat1

I am sure there should be some good way to do it; please advise.

Original string:

my $ORG_str = 'A.B.C.\\valid.A .\\valid.A.B.C .\\valid.X.Y.Z .p.q.r.s';

Expected String:

my $EXP_op = 'A.B.C.\\valid?A .\\valid?A?B?C .\\valid?X?Y?Z .p.q.r.s';

Substitute the character . with ? only if it lies between a backslash \ and the next whitespace character.

Upvotes: 2

Views: 1290

Answers (3)

Xtof

Reputation: 222

This is not simple to do with a single regexp, so I used divide and conquer to compute the result. The small recursive function below replaces a single '.' per group on each pass, where a group runs from a '\' to the next ' '.

The recursion ends when there is nothing left to replace.

sub replace {
    my ($input) = @_;

    my $result = $input;
    $result =~ s/(\\\S*?)\.(.*? )/$1?$2/g;
    return $result if $result eq $input;
    return replace($result);
}
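
The same "repeat until nothing changes" idea can also be written as a loop instead of recursion. This is a minimal sketch rather than part of the original answer: the name replace_loop is hypothetical, and the pattern is the one used above.

```perl
use strict;
use warnings;

# Hypothetical loop-based variant of replace(): keep substituting one '.'
# per backslash-to-space group until a pass makes no substitution.
sub replace_loop {
    my ($input) = @_;
    my $result = $input;
    # s///g returns the number of substitutions (false when there are none),
    # so the loop stops as soon as a pass changes nothing.
    1 while $result =~ s/(\\\S*?)\.(.*? )/$1?$2/g;
    return $result;
}

print replace_loop('ab\x.x.x. cd\y.y.y.'), "\n";   # prints ab\x?x?x? cd\y.y.y.
```

Each successful pass removes at least one '.', so the loop is guaranteed to terminate.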

The function with some test cases

use strict;
use warnings;

my $ORG_str= 'A.B.C.\\\\valid.A .\\\\valid.A.B.C .\\\\valid.X.Y.Z .p.q.r.s';
my $EXP_op ='A.B.C.\\\\valid?A .\\\\valid?A?B?C .\\\\valid?X?Y?Z .p.q.r.s';

sub replace {
    my ($input) = @_;

    my $result = $input;
    $result =~ s/(\\\S*?)\.(.*? )/$1?$2/g;
    return $result if $result eq $input;
    return replace($result);
}

my $check;
my $result;
my $expected;

$check = 'abcd'; $expected = $check;
$result = replace($check);
assert($result eq $expected, "'$check' gives '$expected'");

$check = 'ab\xxx. cd'; $expected = 'ab\xxx? cd';
$result = replace($check);
assert($result eq $expected, "'$check' gives '$expected'");

$check = 'ab\x.x.x. cd'; $expected = 'ab\x?x?x? cd';
$result = replace($check);
assert($result eq $expected, "'$check' gives '$expected'");

$check = 'ab\x.x.x. cd\y.y.y.'; $expected = 'ab\x?x?x? cd\y.y.y.';
$result = replace($check);
assert($result eq $expected, "'$check' gives '$expected'");

$check = 'ab\x.x.x. cd\xxx.xxx..xxx...x \y.y.y.'; $expected = 'ab\x?x?x? cd\xxx?xxx??xxx???x \y.y.y.';
$result = replace($check);
assert($result eq $expected, "'$check' gives '$expected'");

$check = '. ..\.. ...\.. ...\.. ...\..'; $expected = '. ..\?? ...\?? ...\?? ...\..';
$result = replace($check);
assert($result eq $expected, "'$check' gives '$expected'");


$check = $ORG_str; $expected = $EXP_op; 
$result = replace($check);
assert($result eq $expected, "'$check' gives '$expected'");


sub assert {
    my ($cond, $mesg) = @_;
    print "checking $mesg ... ";
    die "\nFAIL: $mesg" unless $cond;
    print "OK\n";
}

The result

checking 'abcd' gives 'abcd' ... OK
checking 'ab\xxx. cd' gives 'ab\xxx? cd' ... OK
checking 'ab\x.x.x. cd' gives 'ab\x?x?x? cd' ... OK
checking 'ab\x.x.x. cd\y.y.y.' gives 'ab\x?x?x? cd\y.y.y.' ... OK
checking 'ab\x.x.x. cd\xxx.xxx..xxx...x \y.y.y.' gives 'ab\x?x?x? cd\xxx?xxx??xxx???x \y.y.y.' ... OK
checking '. ..\.. ...\.. ...\.. ...\..' gives '. ..\?? ...\?? ...\?? ...\..' ... OK
checking 'A.B.C.\\valid.A .\\valid.A.B.C .\\valid.X.Y.Z .p.q.r.s' gives 'A.B.C.\\valid?A .\\valid?A?B?C .\\valid?X?Y?Z .p.q.r.s' ... OK
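
The same result can probably be had in a single pass by matching each whole group and translating characters inside the match only. This is a minimal sketch, not part of the original answer: replace_once is a hypothetical name, and tr///r assumes Perl 5.14 or newer.

```perl
use strict;
use warnings;
use 5.014;   # tr///r (non-destructive transliteration) needs Perl 5.14+

# Hypothetical single-pass variant: match each backslash-led run of
# non-space characters that is followed by a space, then translate
# '.' to '?' inside that match only.
sub replace_once {
    my ($input) = @_;
    (my $result = $input) =~ s{(\\\S*)(?= )}{ $1 =~ tr/./?/r }ge;
    return $result;
}

say replace_once('ab\x.x.x. cd\y.y.y.');   # ab\x?x?x? cd\y.y.y.
```

The /e modifier evaluates the replacement as Perl code, which is what allows a second operation to run on just the matched text. The look-ahead (?= ) requires the trailing space without consuming it.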

Upvotes: 1

vks

Reputation: 67968

\\\\[^. ]*\K|(?!^)\G\.([^. ]*)

You can try this. Replace with ?$1. See the demo.

https://regex101.com/r/mT0iE7/28

The resulting string will not be exactly what you want, but you can easily clean it up.

\?(?=\?)

Replace with the empty string and you have what you want. See the demo.

https://regex101.com/r/mT0iE7/29
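
In Perl, the two steps might look like the sketch below. It assumes the input contains two literal backslashes before each "valid" (which is what the \\\\ in the pattern matches) and Perl 5.10 or newer for \K. The /e form of the replacement is my own adjustment: it avoids an "uninitialized value" warning when $1 is not set.

```perl
use strict;
use warnings;
use 5.010;   # \K and the // (defined-or) operator need Perl 5.10+

# Assumed input: two literal backslashes before each "valid".
my $str = 'A.B.C.\\\\valid.A .\\\\valid.A.B.C .\\\\valid.X.Y.Z .p.q.r.s';

# Step 1: insert '?' right after each \\token prefix, and turn every
# ".chunk" that continues from the previous match (\G) into "?chunk".
# $1 is undefined when the first alternative matches, hence the // ''.
$str =~ s{\\\\[^. ]*\K|(?!^)\G\.([^. ]*)}{ '?' . ($1 // '') }ge;

# Step 2: remove the doubled '?' left behind by step 1.
$str =~ s/\?(?=\?)//g;

say $str;
```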

Upvotes: 1

Chris Smeele

Reputation: 977

Look into look-around regexes.

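s/(?<=AA )pat1(?= BB)/PAT2/g

This matches a pat1 that is preceded by "AA " and followed by " BB" and replaces only the pat1 itself. The look-behind and look-ahead are zero-width, so AA and BB are not part of the match and stay unchanged.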
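
Applied to the string from the question, a minimal sketch could look like this. It uses the binding operator =~ so the result lands back in the same variable, and PAT2 as in the question's expected output.

```perl
use strict;
use warnings;

my $str = "a.b.AA pat1 BB hgf AA pat1 BB jkl CC pat1 don't change pat1";

# (?<=AA ) and (?= BB) only assert what surrounds pat1; they consume
# nothing, so the substitution rewrites pat1 alone and modifies $str in place.
$str =~ s/(?<=AA )pat1(?= BB)/PAT2/g;

print "$str\n";   # a.b.AA PAT2 BB hgf AA PAT2 BB jkl CC pat1 don't change pat1
```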

Upvotes: 4
