NaiTreNo

Reputation: 49

preg_replace: replace only the string between two symbols if it contains something specific

I'm trying to replace only the string between two symbols, and only if that string contains a specific word. For example:

$string = '%Test% %font-style:italic; font-weight:bold;%'; // the properties can also appear in a different order, such as:
$string = '%Test% %font-weight:bold; font-style:italic;%';

So the string I want to run preg_replace on is the one between the two % symbols, and I want the replacement to happen only if that string contains CSS declarations such as font-style:italic; color:red; font-weight:bold; etc. I've tried:

$string = preg_replace('`\%(.*?)((.*?):(.*?);)(.*?)\%`si', '(span style="$2$5")', $string); // "(" stands in for the HTML "<" here

But it caused a problem when I used it on:

http://localhost/NaiTreNo/Games/Games/BatMan%20Arkham%20Knight/Image/Cover.jpg :D %color:blue; font-weight:bold;%

It should return:

http://localhost/NaiTreNo/Games/Games/BatMan%20Arkham%20Knight/Image/Cover.jpg :D
<span style="color:blue; font-weight:bold;">

But it returned:

http://localhost/NaiTreNo/Games/Games/BatMan<span style="20Arkham%20Knight/Image/Cover.jpg :D %color:blue; font-weight:bold;%">

Please help.

Upvotes: 3

Views: 783

Answers (2)

axiac

Reputation: 72256

Your regex is too loose: it only checks that a : and a ; appear somewhere inside the string. I would use the fact that CSS property names have a specific format to write a regex that doesn't match just any string containing : and ;.
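To see the looseness concretely, here is a small sketch that runs the pattern from the question against the URL example and prints what it matched. The variable names are only illustrative.

```php
<?php
// Pattern from the question: any text containing ':' and ';' between two '%' qualifies.
$loose = '`\%(.*?)((.*?):(.*?);)(.*?)\%`si';
$input = 'http://localhost/NaiTreNo/Games/Games/BatMan%20Arkham%20Knight/Image/Cover.jpg :D '
       . '%color:blue; font-weight:bold;%';

// The match starts at the first '%' of the URL (the one in "%20"), not at "%color",
// because the ':' of ":D" and the ';' after "blue" are enough to satisfy the pattern.
preg_match($loose, $input, $m);
echo $m[0], PHP_EOL;
// prints: %20Arkham%20Knight/Image/Cover.jpg :D %color:blue; font-weight:bold;%
```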

For example, something like this:

#%(([a-z]+(-[a-z]+){0,2}: *[^;]+;)+)(.*?)%#si

A CSS property name starts with a word containing one or more lowercase letters [a-z]+, followed by zero, one or two more words, each of them preceded by a dash (-[a-z]+){0,2}.

A rule to restrict the overly permissive .*? used for values could also be written, but the result isn't worth the effort (and the regex becomes difficult to understand).

How the regex works:

%                      # your custom boundary start symbol
  (                    # start of group #1 used to capture the CSS rules
    (                  # start of group #2 that captures a single CSS rule
      [a-z]+           # first word of CSS property name
      (-[a-z]+){0,2}   # 0-2 more words, separated with dash (-)
      : *              # the colon followed by optional white spaces
      [^;]+;           # anything until the first semicolon (at least one character)
    )+                 # end of group #2; it can repeat; at least one occurrence is required
  )                    # end of group #1
  (.*?)                # captures everything after the last semicolon
%                      # your custom boundary end symbol
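As a rough sketch of how the pattern above could be plugged into preg_replace, the snippet below applies it to the two inputs from the question. The replacement string is an assumption (the answer does not give one): it joins group #1 and group #4, because in these inputs the rules are separated by a space, so only the first rule ends up in group #1 and the rest is captured by the trailing (.*?).

```php
<?php
// Pattern from the answer, with '#' as delimiter and the s and i flags.
$pattern = '#%(([a-z]+(-[a-z]+){0,2}: *[^;]+;)+)(.*?)%#si';

// Assumed replacement: $1 holds the leading rule(s), $4 whatever follows them.
$replacement = '<span style="$1$4">';

$inputs = [
    '%Test% %font-style:italic; font-weight:bold;%',
    'http://localhost/NaiTreNo/Games/Games/BatMan%20Arkham%20Knight/Image/Cover.jpg :D '
        . '%color:blue; font-weight:bold;%',
];

$results = [];
foreach ($inputs as $input) {
    // "%Test%" and "%20Arkham%" are skipped: neither is followed by "name: value;".
    $results[] = preg_replace($pattern, $replacement, $input);
}

echo implode(PHP_EOL, $results), PHP_EOL;
// prints:
// %Test% <span style="font-style:italic; font-weight:bold;">
// http://localhost/NaiTreNo/Games/Games/BatMan%20Arkham%20Knight/Image/Cover.jpg :D <span style="color:blue; font-weight:bold;">
```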

The regex above doesn't match when there is only one CSS property and its value is not followed by a semicolon, e.g. %color: red%. To fix this, the + after group #2 must be replaced with * (to match zero or more CSS rules ending in ;), but then the trailing (.*?) will match anything, including Test or the URL in your examples.

This can be fixed by replacing .*? in the last group with the content of group #2 minus the trailing ;. The expression becomes longer and harder to understand, so I won't post it here. You'd better make sure your CSS rules always end with a semicolon (;), including the last one.
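For readers who still want to accept a missing final semicolon, here is one possible shape for such an expression. It is only a sketch and not the expression the answer had in mind: it additionally assumes that rules are separated by "; " with optional spaces and that values never contain %, so something like width:50% would not be accepted. The variable names $prop, $rule and $pattern are illustrative.

```php
<?php
// Building blocks (hypothetical names), assembled by concatenation for readability.
$prop = '[a-z]+(?:-[a-z]+){0,2}';   // property name: 1 to 3 dash-separated words
$rule = $prop . ': *[^;%]+';        // "name: value" where the value has no ';' or '%'

// One rule, then any number of "; rule", then an optional final ';'.
$pattern = '#%(' . $rule . '(?:; *' . $rule . ')*;?)%#i';

$inputs = [
    '%Test% %color: red%',
    '%Test% %font-style:italic; font-weight:bold;%',
    'http://localhost/NaiTreNo/Games/Games/BatMan%20Arkham%20Knight/Image/Cover.jpg :D '
        . '%color:blue; font-weight:bold;%',
];

$results = [];
foreach ($inputs as $input) {
    // Group 1 now holds the complete list of rules, with or without the last ';'.
    $results[] = preg_replace($pattern, '<span style="$1">', $input);
}

echo implode(PHP_EOL, $results), PHP_EOL;
```

With these inputs, %Test% and the %20 sequences in the URL are left alone, because neither is followed by a property name and a colon.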

A playground for this regex can be found at: https://regex101.com/r/vC1oS2/3

Upvotes: 3

ksimka

Reputation: 1455

I suggest requiring a space (or the start of the string) before the opening %.

// The added (^| ) group shifts the capture numbers by one, so the first rule and the trailing text are now $3 and $6.
$string = preg_replace('`(^| )\%(.*?)((.*?):(.*?);)(.*?)\%`si', ' <span style="$3$6">', $string);

Try at https://3v4l.org/uubF7
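A quick check of this idea against both inputs from the question is sketched below. Note one assumption: prepending the (^| ) group shifts every capture number by one, so this sketch uses $3 and $6 in the replacement to refer to the first rule and the trailing text.

```php
<?php
// Space-anchored pattern: the opening '%' must follow a space or start the string.
$pattern = '`(^| )\%(.*?)((.*?):(.*?);)(.*?)\%`si';
$replacement = ' <span style="$3$6">';

$url = 'http://localhost/NaiTreNo/Games/Games/BatMan%20Arkham%20Knight/Image/Cover.jpg :D '
     . '%color:blue; font-weight:bold;%';
$test = '%Test% %font-style:italic; font-weight:bold;%';

// The '%20' sequences are not preceded by a space, so the URL is left intact.
$urlResult = preg_replace($pattern, $replacement, $url);
echo $urlResult, PHP_EOL;

// "%Test%" sits at the start of the string, so the match begins there and
// swallows "Test% %font-style" as if it were a property name.
$testResult = preg_replace($pattern, $replacement, $test);
echo $testResult, PHP_EOL;
```

So the space requirement appears to handle the URL case, while the %Test% case still over-matches because the body of the pattern is as permissive as before.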

Upvotes: 0
