alcoceba

Reputation: 423

Regex to replace a word between two different words in a string

I would like to use a regex to replace a word ABC between two words MNO and XYZ with '', but not the occurrences of the word ABC that aren't between MNO and XYZ.

For example, given the following string:

Lorem ABC ipsum ABC bla MNO bla ipsum ABC asfg 123 hello ABC dd ABC XYZ hello ABC

The expected result would be:

Lorem ABC ipsum ABC bla MNO bla ipsum asfg 123 hello dd XYZ hello ABC

So the only ABCs replaced are the three between MNO and XYZ.

I tried several regexes with preg_replace in PHP, but had no success.

For example, with this one I don't know how to match only the ABC occurrences instead of everything between MNO and XYZ:

/(?<=MNO)(.*)ABC(.*)(?=XYZ)/g

Here's a test link.

I would prefer a solution that uses regex and preg_replace.

Any ideas? Thanks

Upvotes: 2

Views: 182

Answers (2)

Wiktor Stribiżew

Reputation: 626748

You may use preg_replace_callback:

$s = preg_replace_callback('~(MNO)(.*?)(XYZ)~s', function($m) {
    return $m[1] . str_replace('ABC', 'XXX', $m[2]) . $m[3];
}, $s);

Or, with lookarounds, to make the code inside the anonymous function a bit leaner:

$s = preg_replace_callback('~(?<=MNO).*?(?=XYZ)~s', function($m) {
    return str_replace('ABC', 'XXX', $m[0]);
}, $s);

See the PHP demo

Here, (MNO)(.*?)(XYZ) matches MNO, everything between MNO and XYZ, and then XYZ, capturing them into three groups. Inside the anonymous function, ABC is replaced only in the second group. Note that the s flag at the end of the regex is required to make . match line break chars, too.

In the second example, (?<=MNO) is a positive lookbehind that requires MNO to be present immediately to the left of the current location, and (?=XYZ) is a positive lookahead that requires XYZ to be present immediately to the right of the current location. Neither lookaround consumes text, so no groups are needed here.
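For the question's actual goal (removing ABC rather than substituting XXX), the lookaround variant can be wrapped in a small helper. This is only a minimal sketch: the function name removeBetween and its parameters are hypothetical, and dropping the space in front of each ABC is an assumption made so that the output matches the single-spaced expected result shown in the question.

```php
<?php
// Hypothetical helper: removes every " $word" found between $start and $end.
// The name and signature are illustrative, not part of the original answer.
function removeBetween(string $text, string $word, string $start, string $end): string
{
    // preg_quote() escapes regex metacharacters in the delimiters.
    $pattern = '~(?<=' . preg_quote($start, '~') . ').*?(?=' . preg_quote($end, '~') . ')~s';

    $result = preg_replace_callback($pattern, function (array $m) use ($word): string {
        // Assumption: each occurrence is preceded by one space, which is removed
        // together with the word so that no double spaces are left behind.
        return str_replace(' ' . $word, '', $m[0]);
    }, $text);

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $text;
}

$s = 'Lorem ABC ipsum ABC bla MNO bla ipsum ABC asfg 123 hello ABC dd ABC XYZ hello ABC';
echo removeBetween($s, 'ABC', 'MNO', 'XYZ'), PHP_EOL;
// Lorem ABC ipsum ABC bla MNO bla ipsum asfg 123 hello dd XYZ hello ABC
```

An ABC that directly touches MNO with no space in between would not be removed by this sketch; a regex-based inner replacement would be needed for that case.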

It is much harder with plain preg_replace:

$s = preg_replace('~(?:\G(?!\A)|MNO)(?:(?!MNO).)*?\KABC(?=(?:(?!MNO).)*?XYZ)~s', 'XXX', $s);

See the regex demo.

Details

  • (?:\G(?!\A)|MNO) - end of the previous match or MNO
  • (?:(?!MNO).)*? - any single char that does not start an MNO char sequence, 0 or more occurrences, as few as possible
  • \K - match reset operator that discards the text in the match buffer
  • ABC - an ABC
  • (?=(?:(?!MNO).)*?XYZ) - immediately to the right, there must be 0+ chars, as few as possible, that do not start an MNO char sequence, and then XYZ.
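
To see the pieces above working together, the pattern can be run against the sample string from the question. This is a small sketch; the variable names are illustrative, and XXX is kept as the replacement so the changed spots stay visible.

```php
<?php
// Pattern from the answer, stored in a single-quoted string so that
// \G, \A and \K reach the regex engine unchanged.
$pattern = '~(?:\G(?!\A)|MNO)(?:(?!MNO).)*?\KABC(?=(?:(?!MNO).)*?XYZ)~s';

// Sample input taken from the question.
$input = 'Lorem ABC ipsum ABC bla MNO bla ipsum ABC asfg 123 hello ABC dd ABC XYZ hello ABC';

// Each match consists only of the ABC after \K, so only that part is replaced.
$result = preg_replace($pattern, 'XXX', $input);

echo $result, PHP_EOL;
// Lorem ABC ipsum ABC bla MNO bla ipsum XXX asfg 123 hello XXX dd XXX XYZ hello ABC
```

The ABC occurrences before MNO are skipped because the first match can only start at MNO, and the final ABC is skipped because the lookahead finds no XYZ after it.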

Upvotes: 1

Gurmanjot Singh

Reputation: 10360

This will work if there is just one occurrence of MNO and XYZ in the sentence.

Try this regex:

ABC(?!.*MNO)(?=.*XYZ)

Replace each match with an empty string.

Click for Demo

Explanation:

  • ABC - matches ABC
  • (?!.*MNO) - negative lookahead to make sure that the current match is not followed by MNO anywhere later in the string
  • (?=.*XYZ) - positive lookahead to make sure that the current match is followed by XYZ somewhere later in the string

Code(RESULT)

$re = '/ABC(?!.*MNO)(?=.*XYZ)/m';
// Sample string from the question
$str = 'Lorem ABC ipsum ABC bla MNO bla ipsum ABC asfg 123 hello ABC dd ABC XYZ hello ABC';
$subst = '';
$result = preg_replace($re, $subst, $str);
echo "The result of the substitution is " . $result;
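
The single-pair restriction mentioned at the top of this answer is easy to reproduce. The following short sketch uses a made-up input with two MNO ... XYZ pairs and XXX as a visible replacement (both are assumptions for illustration), and compares the result with the preg_replace_callback approach from the other answer.

```php
<?php
// Made-up input with two MNO ... XYZ pairs and one ABC outside of them.
$str = 'MNO ABC XYZ ABC MNO ABC XYZ';

// Lookahead-only pattern from this answer.
$lookahead = preg_replace('/ABC(?!.*MNO)(?=.*XYZ)/', 'XXX', $str);
echo $lookahead, PHP_EOL;
// MNO ABC XYZ ABC MNO XXX XYZ
// The first ABC is missed: a later MNO exists, so (?!.*MNO) rejects it.

// Callback approach from the other answer, for comparison.
$callback = preg_replace_callback('~(MNO)(.*?)(XYZ)~s', function (array $m): string {
    return $m[1] . str_replace('ABC', 'XXX', $m[2]) . $m[3];
}, $str);
echo $callback, PHP_EOL;
// MNO XXX XYZ ABC MNO XXX XYZ
```

With several pairs in one string, the lookahead-only pattern therefore affects only the last pair.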

Upvotes: 1
