Reputation: 15375
I'm falling deeper into the regex's dark side. I need to parse this:
{{word(a|b|c)|word$1}}
{{word(s?)|word$1}}
{{w(a|b|c)ord(s?)|w$1ord$2}}
As you may have noticed, it is a search & replace scheme containing regular expressions. The wikimedia engine does it very well, but I couldn't find out how it does it: right here.
I just need to get the first part and the second part into two separate variables. For instance:
preg_match(REGEX, '{{word(a|b|c)|word$1}}', $result); // Applying REGEX on this
echo $result[1]; // word(a|b|c)
echo $result[2]; // word$1
How would you do it? It's like regex in regex, I'm completely lost...
Upvotes: 1
Views: 89
Reputation: 225144
It really depends on how deep the nesting can be, but you can just split it by |, taking care not to split it by any | within parentheses. Here's the easy way, I suppose:
$str = 'word(a|b|c)|word$1'; // Trim off the leading and trailing {{ and }} first
$items = explode('|', $str);
$realItems = array();
for ($i = 0; $i < count($items); $i++) {
    $realItem = $items[$i];
    // More "(" than ")" means the split happened inside a group.
    // The bounds check avoids reading past the end on unbalanced input.
    while (substr_count($realItem, '(') > substr_count($realItem, ')') && $i + 1 < count($items)) {
        // Glue them together and skip one!
        $realItem .= '|' . $items[++$i];
    }
    $realItems[] = $realItem;
}
Now $realItems contains your 2-4 key values, which you can simply pass into preg_replace; it'll do all the work for you.
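To show how the pieces could fit together, here is a minimal sketch that wraps the loop above in a function and feeds the result to preg_replace. The names splitTopLevel and applyRule are hypothetical, the "~" delimiter is an assumed choice (so the pattern must not contain "~" itself), and the rule is assumed to split into exactly two parts.

```php
<?php
// Hypothetical helper: split "pattern|replacement" on top-level pipes only.
function splitTopLevel(string $body): array
{
    $items = explode('|', $body);
    $parts = [];
    for ($i = 0; $i < count($items); $i++) {
        $part = $items[$i];
        // More "(" than ")" means the cut fell inside a group: re-attach the next chunk.
        while (substr_count($part, '(') > substr_count($part, ')') && $i + 1 < count($items)) {
            $part .= '|' . $items[++$i];
        }
        $parts[] = $part;
    }
    return $parts;
}

// Hypothetical wrapper: apply one {{pattern|replacement}} rule to some text.
function applyRule(string $rule, string $text): string
{
    $body = substr($rule, 2, -2); // drop the leading {{ and the trailing }}
    [$pattern, $replacement] = splitTopLevel($body); // assumes exactly two parts
    // Assumption: the pattern contains no "~", so it is safe as a delimiter.
    return preg_replace('~' . $pattern . '~', $replacement, $text);
}

print_r(splitTopLevel('w(a|b|c)ord(s?)|w$1ord$2'));
echo applyRule('{{word(s?)|term$1}}', 'two words'), "\n";
```

Note that this only counts parentheses, so an escaped \( or a | inside a character class would still confuse it.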
Upvotes: 1
Reputation: 33928
You could match the parts using something like:
{{((?:(?!}}).)+)\|([^|]+?)}}
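In PHP that could look roughly like the sketch below. The function name parseRule and the "~" delimiters are my own assumptions; the pattern itself is the one above, unchanged.

```php
<?php
// Hypothetical helper: returns [pattern, replacement] or null when nothing matches.
function parseRule(string $subject): ?array
{
    // Group 1 runs greedily up to the last "|" before "}}"; group 2 is the pipe-free tail.
    $regex = '~{{((?:(?!}}).)+)\|([^|]+?)}}~';
    if (preg_match($regex, $subject, $m) !== 1) {
        return null;
    }
    return [$m[1], $m[2]];
}

[$pattern, $replacement] = parseRule('{{w(a|b|c)ord(s?)|w$1ord$2}}');
echo $pattern, "\n";     // the search part
echo $replacement, "\n"; // the replacement part
```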
Note that if you are allowing arbitrary PCRE regex then some very complex and slow patterns can be constructed, possibly allowing simple DoS attacks on your site.
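One way to limit the damage, sketched under assumptions: lower pcre.backtrack_limit while the user-supplied pattern runs and treat a null result as a failure. The name safeReplace and the budget of 10000 are hypothetical, and how strictly the limit is enforced can depend on the PCRE build (JIT in particular), so take this as a starting point rather than a guarantee.

```php
<?php
// Hypothetical wrapper: run a user-supplied pattern under a reduced backtrack budget.
function safeReplace(string $pattern, string $replacement, string $text): ?string
{
    $old = ini_set('pcre.backtrack_limit', '10000'); // assumed budget, tune as needed
    // "@" hides the warning that an invalid user pattern would raise.
    // Assumption: the pattern contains no "~", so it is safe as a delimiter.
    $result = @preg_replace('~' . $pattern . '~', $replacement, $text);
    if ($old !== false) {
        ini_set('pcre.backtrack_limit', $old); // restore the previous setting
    }
    // preg_replace returns null on error (invalid pattern, limit exceeded, ...).
    return $result;
}

$out = safeReplace('word(s?)', 'term$1', 'two words');
echo $out === null ? 'pattern rejected' : $out, "\n";
```

When the result is null, preg_last_error() can tell you whether the backtrack limit was the reason.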
Upvotes: 2
Reputation: 121820
It is actually not that hard.
The thing is, the replacement string will only ever contain an escaped |, i.e. \|.
And for one of these very few occasions, .* will actually be useful here.
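Do: preg_match('/^{{(.*)\|([^|]+(?:\\\\\|[^|]*)*)}}$/', $subject, $result); where $subject is the string to parse (the backslashes are doubled for the PHP string literal). This should do what you want.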
The trick here is the second group: it is, again, the normal* (special normal*)* pattern, where normal is [^|] (anything but a pipe), and special is \\\| (a backslash followed by a pipe).
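A small sketch of that pattern in use. One caveat as far as I can tell: because .* is greedy, the first group runs up to the last pipe whether or not it is escaped, so the version below adds a (?<!\\) lookbehind before the separator. That lookbehind, the "~" delimiters and the name splitRule are my own assumptions, not part of the pattern above.

```php
<?php
// Hypothetical helper: returns [pattern, replacement] or null when the rule is malformed.
function splitRule(string $rule): ?array
{
    // Single-quoted PHP string: '\\\\' is the regex \\ (one literal backslash).
    // (?<!\\)\|  = a separator pipe that is NOT preceded by a backslash (added assumption)
    // [^|]+(?:\\\|[^|]*)*  = normal* (special normal*)* as described above
    $regex = '~^{{(.*)(?<!\\\\)\|([^|]+(?:\\\\\|[^|]*)*)}}$~';
    return preg_match($regex, $rule, $m) === 1 ? [$m[1], $m[2]] : null;
}

print_r(splitRule('{{word(a|b|c)|word$1}}')); // plain case
print_r(splitRule('{{a(b|c)|x\|y}}'));        // replacement with an escaped pipe
```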
Upvotes: 0