Reputation: 2407
This is my first question here. I need to do what I guess is a simple PHP preg_replace() replacement, but I have no knowledge of regular expressions.
I have an HTML-formatted text string, broken up by several occurrences of " + figure("br") + " (including both the opening and closing quotes). I need to change them to <em class="br"></em>, where 'br' is the argument I must preserve.
I have 200+ texts to process. Of course I could replace the prefix and suffix separately, but I want to do it the right way. Thanks in advance, and forgive my English.
Sample input:
<p>Bien!</p>
<p>Gana <b>Material</b> por el <b>Doble Ataque</b> al " + figure("bn") + "c6 y a la " + figure("br") + "h8.</p>
Sample output: <p>Bien!</p><p>Gana <b>Material</b> por el <b>Doble Ataque</b> al <em class="bn"></em>c6 y a la <em class="br"></em>h8.</p>
[Edited to include the real data]
Upvotes: 2
Views: 505
Reputation: 30750
I think we need a little more information about your scenario to give you something useful. The simplest way to do what you describe is to do something like:
$output = preg_replace('/.*\("br"\).*/', '<span class="br"></span>', $input);
But I don't know if that's what you actually want. Because .* is greedy, that will strip out ALL the text on every line that contains ("br") and replace the whole line with a single <span class="br"></span> block, so all you'll have left of those lines is the string <span class="br"></span>.
It sounds to me like what you want might be to change blocks that look like foo("bar")baz into blocks like foo<span class="bar"></span>baz. If that's the case, you'll probably want something like this:
$output = preg_replace('/\("(.*?)"\)/', '<span class="$1"></span>', $input);
However, that's only my best guess based on the way I read your question. To really solve the problem, we need to know a little more about what pre_string, post_string, and br are supposed to represent, and how they might vary. Some sample input and output text might help, as might some info on what you're using this for.
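For what it's worth, here is a minimal sketch of the capture-group idea, assuming the input really has the foo("bar")baz shape. The input string is made up for illustration, and the pattern matches only the parenthesized argument so that the text after it survives:

```php
<?php
// Hypothetical input, modeled on the foo("bar")baz shape described above.
$input = 'foo("bar")baz and qux("bn")end';

// \(" and "\) match the literal delimiters; (.*?) lazily captures the argument
// so that $1 in the replacement becomes the class name.
$output = preg_replace('/\("(.*?)"\)/', '<span class="$1"></span>', $input);

echo $output, PHP_EOL;
// foo<span class="bar"></span>baz and qux<span class="bn"></span>end
```

Note that this leaves the function name (foo, qux) in place, which may or may not be what you want.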
Edit: Your latest edit makes it a little more clear, I think. It looks like you're trying to parse JavaScript or some other programming language with regular expressions, which you generally can't do perfectly due to the limitations of regex. However, the following should work in most cases:
$pattern = '/(["\'])\s*\+\s*\w+\((["\'])(.*?)\2\)\s*\+\s*\1/';
$output = preg_replace($pattern, '<span class="$3"></span>', $input);
Explanation:
/
(["\']) #Either " or '. This is captured in backreference 1 so that it can be matched later.
\s*\+\s* #A literal + symbol surrounded by any amount of whitespace.
\w+ #At least one word character (alphanumeric or _). This is "figure" in your example.
\( #A literal ( character.
(["\']) #Either " or '. This is captured in backreference 2.
(.*?) #Any number of characters, but the `?` makes it lazy so it won't match all the way to the last `") + "` in the document.
\2 #Backreference 2. This matches the " or ' from earlier. I didn't use ["\'] again because I didn't want something like 'blah" to match.
\) #A literal ) character.
\s*\+\s* #A literal + symbol surrounded by any amount of whitespace.
\1 #Backreference 1, to match the first " or ' quote in the string.
/
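The breakdown above can also be kept as a live, commented pattern by using PCRE's x (extended) flag, which ignores whitespace and # comments inside the pattern. The following is only a sketch run against the sample line from the question. It assumes the <em> element the question asks for rather than <span>, and it uses a nowdoc so the quotes need no escaping:

```php
<?php
// Sample line taken from the question.
$input = '<p>Gana <b>Material</b> por el <b>Doble Ataque</b> al " + figure("bn") + "c6 y a la " + figure("br") + "h8.</p>';

// Same pattern as above, spread out with the x flag.
// The comments avoid the delimiter character, which would end the pattern early.
$pattern = <<<'REGEX'
/
(["'])      # outer quote, captured as group 1
\s*\+\s*    # literal + with optional whitespace
\w+         # function name, e.g. figure
\(          # literal (
(["'])      # inner quote, captured as group 2
(.*?)       # the argument, lazy, captured as group 3
\2          # the same inner quote again
\)          # literal )
\s*\+\s*    # literal + with optional whitespace
\1          # the same outer quote again
/x
REGEX;

$output = preg_replace($pattern, '<em class="$3"></em>', $input);

echo $output, PHP_EOL;
// <p>Gana <b>Material</b> por el <b>Doble Ataque</b> al <em class="bn"></em>c6 y a la <em class="br"></em>h8.</p>
```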
Hope that's relatively easy to understand. It can be hard to explain what regex patterns are doing, so I'm sorry if this is still hard to grok. Here's some more info on backreferences and lazy quantifiers if you're still confused.
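If it helps, here is a small sketch of what the lazy quantifier and the backreference each buy you. The input strings are made up for illustration and are not from your data:

```php
<?php
$input = 'a " + figure("bn") + "c6 and " + figure("br") + "h8';

// Greedy: (.*) runs to the LAST ") on the line, so one match swallows both calls.
preg_match('/\("(.*)"\)/', $input, $greedy);

// Lazy: (.*?) stops at the FIRST ") it can reach.
preg_match('/\("(.*?)"\)/', $input, $lazy);

echo $greedy[1], PHP_EOL; // bn") + "c6 and " + figure("br
echo $lazy[1], PHP_EOL;   // bn

// Backreference: \1 must repeat whatever quote group 1 captured,
// so mismatched quotes are rejected.
$matched    = preg_match('/(["\'])\w+\1/', '"blah"');  // 1
$mismatched = preg_match('/(["\'])\w+\1/', '\'blah"'); // 0
```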
I'm not sure about the backreference syntax; I don't usually code in PHP these days. If anyone wants to correct me I'd welcome it.
Upvotes: 2
Reputation: 145482
If you have a variable pre and post string (or one with meta characters as in your case), then I think it's best to use some regex escaping there:
// " + figure("br") + "
$pre = '" + figure';
$post = ' + "';
// escape
$pre = preg_quote($pre, "#");
$post = preg_quote($post, "#");
// then the regex becomes easy
$string = preg_replace(
    "#$pre\(\"(\w+)\"\)$post#",
    '<em class="$1"></em>',
    $string
);
I assume you are converting some source code?
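To see what the escaping step actually produces, here is a rough sketch run against the sample line from the question. It builds the same pattern by concatenating single-quoted pieces instead of interpolating, which is just a stylistic assumption on my part to avoid escaping the double quotes:

```php
<?php
// Sample line taken from the question.
$string = '<p>Gana <b>Material</b> por el <b>Doble Ataque</b> al " + figure("bn") + "c6 y a la " + figure("br") + "h8.</p>';

// preg_quote() escapes regex metacharacters (here the +) and the # delimiter.
$pre  = preg_quote('" + figure', '#');
$post = preg_quote(' + "', '#');

echo $pre, PHP_EOL;  // " \+ figure
echo $post, PHP_EOL; //  \+ "

// Literal prefix, the captured argument, then the literal suffix.
$pattern = '#' . $pre . '\("(\w+)"\)' . $post . '#';

$result = preg_replace($pattern, '<em class="$1"></em>', $string);

echo $result, PHP_EOL;
// <p>Gana <b>Material</b> por el <b>Doble Ataque</b> al <em class="bn"></em>c6 y a la <em class="br"></em>h8.</p>
```

Unlike the \s* version, this only matches when the spacing around the + signs is exactly one space, which is fine if all 200+ texts were generated the same way.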
Upvotes: 1