Reputation: 4102
I am trying to find a pattern for the following scenario.
Let's say I have this string:
someString[code]some code[/code]someString
Here "some code" can be anything. What I want to extract are the reserved words (break, class, etc.). A more realistic string looks like this:
someString
[code]
class someClass{}
[/code]
someString
// And again
someString
[code]
class someClass{}
[/code]
someString
So what I am trying to understand is how to match all the reserved words that appear between the [code][/code] tags.
For example: [code]someReservedWord some text anotherReservedWord[/code]
I only want to match someReservedWord and anotherReservedWord.
I was thinking of using preg_match_all to get all the reserved words inside each [code][/code] block, with PREG_OFFSET_CAPTURE to get their positions.
The only thing I can't figure out is the pattern. If anyone has an idea, I would be very thankful. Thank you all and have a nice day.
Upvotes: 0
Views: 87
Reputation: 89547
You can use this:
$pattern = <<<'LOD'
~ (?(DEFINE) (?<words> class | string | function ) )
(?: \[code] | \G(?<!^) )
(?: [^[]+? | \[(?!/code]) )*? \K
\b \g<words> \b
~x
LOD;
preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);
print_r($matches[0]);
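To try the pattern end to end, here is a minimal sketch with a made-up $subject (the sample text and the three-word list are assumptions for illustration only). It also checks the return value, since preg_match_all returns false when PCRE gives up, for example after hitting the backtrack limit.

```php
<?php
// Hypothetical sample input, modeled on the string in the question.
$subject = 'intro [code]class Foo { function bar() {} }[/code] class outside [code]string $s;[/code]';

$pattern = <<<'LOD'
~ (?(DEFINE) (?<words> class | string | function ) )
(?: \[code] | \G(?<!^) )
(?: [^[]+? | \[(?!/code]) )*? \K
\b \g<words> \b
~x
LOD;

$count = preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);

if ($count === false) {
    // PCRE reported an error; preg_last_error() returns the error code.
    echo 'PCRE error code: ', preg_last_error(), PHP_EOL;
} else {
    // Each entry of $matches[0] is a [word, byteOffset] pair.
    foreach ($matches[0] as [$word, $offset]) {
        echo $word, ' at offset ', $offset, PHP_EOL;
    }
}
```

The "class" that sits outside the tags is not reported, because a match can only start at a [code] tag or directly after a previous match.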
Pattern details:
First of all, we define a named group with all the reserved words:
(?(DEFINE) (?<words> class | string | function ) )
The (?(DEFINE)...) syntax lets you define subpatterns outside the main pattern itself. You can then call the named group "words" later in the pattern with \g<words>.
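As a small illustration of (?(DEFINE)...) on its own, the sketch below defines a group once and calls it twice. The group name "color" and the sample text are invented for the example.

```php
<?php
// The DEFINE group consumes no input; it only declares the "color" subpattern.
$pattern = '~(?(DEFINE)(?<color>red|green|blue))\b\g<color>\b-\b\g<color>\b~';

// The subpattern is called twice, once on each side of the hyphen.
preg_match($pattern, 'a red-blue stripe', $m);
echo $m[0], PHP_EOL; // prints "red-blue"
```

Defining the list once keeps the main pattern short and means the word list only has to be edited in one place.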
(?: [^[]+? | \[(?!/code]) )*?
describes all the content before a reserved word. This subpattern can match anything except the closing tag [/code], because it offers a choice between "characters that are not a [" and "a [ that is not followed by /code]". Since it can match almost anything, lazy quantifiers are used so that the match stops as soon as a reserved word is encountered.
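To see what this content subpattern accepts and refuses, here is a rough sketch that anchors it at the start of the input and follows it with the single word "class". The two sample strings are made up.

```php
<?php
// Same "content" subpattern, followed by one reserved word.
$pattern = '~^(?: [^[]+? | \[(?!/code]) )*? \b class \b~x';

// "[0]" is ordinary content, so the word after it is reachable.
var_dump(preg_match($pattern, '$a[0] = 1; class'));     // int(1)

// "[/code]" cannot be crossed, so the word after it is not reachable.
var_dump(preg_match($pattern, 'x = 1; [/code] class')); // int(0)
```

One caveat, offered as an observation rather than something the answer states: nesting +? inside *? lets the engine split the same run of characters in many different ways when no reserved word follows, so on long blocks it may be worth checking preg_last_error() after the call.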
The entry point of the pattern is (?: \[code] | \G(?<!^) ). This forces the match to begin at a [code] tag or to be contiguous with the previous match.
(\G is an anchor that means: "at the start of the string or contiguous with the previous match". With the negative lookbehind (?<!^), you forbid the start of the string.)
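The effect of \G and of the (?<!^) guard can be seen on a simpler, made-up task: collecting the comma-separated numbers that follow a "start:" marker. The marker and the sample string are assumptions for the sketch.

```php
<?php
$subject = '1,2 start:3,4,5 x 6';

// With the guard: a match may begin only at "start:" or right after a previous match.
preg_match_all('~(?:start:|\G(?<!^)),?\K\d+~', $subject, $guarded);
print_r($guarded[0]); // 3, 4, 5

// Without the guard: \G also succeeds at offset 0, so the leading "1,2" leaks in.
preg_match_all('~(?:start:|\G),?\K\d+~', $subject, $unguarded);
print_r($unguarded[0]); // 1, 2, 3, 4, 5
```

In both runs the trailing "6" is skipped, because it is neither preceded by the marker nor contiguous with the previous match.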
\K is a trick that removes everything matched before it from the match result.
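A short sketch of \K in isolation, using an invented "price:" prefix: the prefix must be present for the match to succeed, but it is not part of the reported match, and the reported offset is that of the text after \K.

```php
<?php
// "price:" and optional whitespace are required but dropped from the result.
preg_match('~price:\s*\K\d+~', 'price: 42 USD', $m, PREG_OFFSET_CAPTURE);

echo $m[0][0], ' at offset ', $m[0][1], PHP_EOL; // prints "42 at offset 7"
```

This is why the offsets from PREG_OFFSET_CAPTURE in the main pattern point at the reserved words themselves rather than at the [code] tag.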
Upvotes: 3
Reputation: 5856
$str = "someString[code]some code[/code]someString";
$ret = preg_replace('#\[code\](.+)\[\/code\]#iUs', '<FOUND>$1</FOUND>', $str);
var_dump($ret);
(http://www.phpliveregex.com/p/2tD , see preg_match_all example)
You might also want to search for "BBCode PHP regex".
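Building on the same tag-matching regex, one possible two-step variant is sketched below: first capture the body of each [code] block together with its offset, then look for reserved words inside each body and translate their offsets back to the original string. The function name findReservedWords, the word list, and the sample input are hypothetical.

```php
<?php
// Hypothetical helper: returns a list of [word, offsetInSubject] pairs.
function findReservedWords(string $subject, array $words): array
{
    $found = [];

    // Step 1: capture every block body along with its byte offset.
    preg_match_all('#\[code\](.+)\[/code\]#iUs', $subject, $blocks, PREG_OFFSET_CAPTURE);

    // Step 2: build a whole-word alternation from the (illustrative) word list.
    $quoted = array_map(static function ($w) {
        return preg_quote($w, '~');
    }, $words);
    $wordPattern = '~\b(?:' . implode('|', $quoted) . ')\b~';

    foreach ($blocks[1] as [$body, $bodyOffset]) {
        preg_match_all($wordPattern, $body, $hits, PREG_OFFSET_CAPTURE);
        foreach ($hits[0] as [$word, $offset]) {
            // Offsets inside the body are shifted back into the full string.
            $found[] = [$word, $bodyOffset + $offset];
        }
    }

    return $found;
}

$sample = 'a class [code]class A {}[/code] b [code]function f() {}[/code]';
$result = findReservedWords($sample, ['class', 'function']);
print_r($result);
```

This trades a single clever pattern for two simple ones, which may be easier to maintain when the reserved-word list is long.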
Upvotes: 0