Mat Jones

Reputation: 976

Regex replace surrounding characters while maintaining the string between

I am using PHP to try to convert text from one flavor of Markdown to another flavor.

For example, the string **some text** should be replaced with '''some text''' (the ** on each side is replaced with ''', a triple apostrophe). However, the string **some other text should not be changed, because it does not end with **.

Currently, I am using the following code:

function convertBoldText($line){
    # Regex replace double asterisk IF it is FOLLOWED by a non-asterisk character
    $tmp = preg_replace('/\*{2}(?=[^\*])/', "'''", $line);
    # Regex replace double asterisk IF it is PRECEDED by a non-asterisk character
    return preg_replace('/(?<=[^\*])\*{2}/', "'''", $tmp);
}

However, this code also replaces the asterisks in strings that start with, but do not end with, the double asterisk, which it should not.

How do I use regex to replace double asterisks if and only if the double asterisks are matched (i.e. an opening and a closing double asterisk exist and match each other)?

The largest challenge comes from the case where you have both of the previously mentioned examples combined, like:

** these first asterisks should NOT be replaced **but these ones SHOULD**

Upvotes: 2

Views: 2830

Answers (1)

Wiktor Stribiżew

Reputation: 627535

You can use a regex that matches **, followed by any text that does not contain **, followed by a closing **:

function convertBoldText($line){
    return preg_replace('/\*{2}(?!\s)((?:(?!\*{2}).)*)(?<!\s)\*{2}/s', "'''$1'''", $line);
}

See the IDEONE demo

Regex explanation:

  • \*{2} - 2 *s
  • (?!\s) - there can be no whitespace after the two asterisks
  • ((?:(?!\*{2}).)*) - Group 1 capturing any text but **
  • (?<!\s) - there can be no whitespace before ...
  • \*{2} - two *s
  • /s - the DOTALL modifier: a dot matches any character, including a newline.
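
To make the breakdown above easier to follow, here is a sketch of the same pattern written with the x (extended) modifier, which lets PCRE ignore whitespace and # comments inside the pattern. The variable name $pattern is only illustrative, and the behaviour should be the same as the one-line version.

```php
<?php
// Same pattern as in the answer, spelled out with the x (extended) modifier
// so that whitespace and # comments inside the pattern are ignored.
$pattern = '/
    \*{2}              # opening pair of asterisks
    (?!\s)             # not followed by whitespace
    ((?:(?!\*{2}).)*)  # group 1: any chars, stopping before the next pair
    (?<!\s)            # closing pair must not be preceded by whitespace
    \*{2}              # closing pair of asterisks
/sx';

// $1 is the backreference to group 1 (the text between the asterisks).
echo preg_replace($pattern, "'''$1'''", '**some text**'), "\n";
```

Note that the comments inside the pattern avoid the / character, since an unescaped delimiter would end the pattern early.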

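A more efficient alternative unrolls the loop, so the pattern no longer needs a lookahead before every character (nor the s modifier, since [^*] already matches newlines):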

return preg_replace('/\*{2}(?!\s)([^*]*(?:\*(?!\*)[^*]*)*)(?<!\s)\*{2}/', "'''$1'''", $line);
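
As a quick check, here is a minimal sketch that wraps the unrolled pattern in the function from the question and runs it against the three example strings the question gives. The $samples array is only illustrative.

```php
<?php
// Unrolled-loop version of the pattern from the answer.
function convertBoldText($line) {
    return preg_replace(
        '/\*{2}(?!\s)([^*]*(?:\*(?!\*)[^*]*)*)(?<!\s)\*{2}/',
        "'''$1'''",
        $line
    );
}

// Illustrative sample inputs, taken from the question.
$samples = [
    '**some text**',
    '**some other text',
    '** these first asterisks should NOT be replaced **but these ones SHOULD**',
];

foreach ($samples as $sample) {
    echo convertBoldText($sample), "\n";
}
```

The first string should come back as '''some text''', the second should be unchanged, and in the third only the trailing pair should be converted, because the leading ** is followed by whitespace and so cannot open a match.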

Upvotes: 3
