Reputation: 1727
I'm trying to write a PHP template engine.
Consider the following string:
@foreach($people as $person)
<p></p>
@end
I am able to use the following regex to find it:
@[\w]*\(.*?\).*?@end
But if I have this string:
@cake()
@cake()
@fish()
@end
@end
@end
The regex fails. This is what it finds:
@cake()
@cake()
@fish()
@end
Thanks in advance.
Upvotes: 0
Views: 857
Reputation: 89557
You can match nested functions with a recursive pattern, for example:
$pattern = '~(@(?<func>\w++)\((?<param>[^)]*+)\)(?<content>(?>[^@]++|(?-4))*)@end)~';
or without named captures:
$pattern = '~(@(\w++)\(([^)]*+)\)((?>[^@]++|(?-4))*)@end)~';
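As a rough sketch of how the pattern might be used (the $input string is simply the nested example from the question, and the variable names are my own choice):

```php
<?php
// Named-capture variant of the pattern shown above.
$pattern = '~(@(?<func>\w++)\((?<param>[^)]*+)\)(?<content>(?>[^@]++|(?-4))*)@end)~';

// The nested input from the question.
$input = "@cake()\n@cake()\n@fish()\n@end\n@end\n@end";

if (preg_match($pattern, $input, $m)) {
    // $m[0] is the whole outermost block, from the first "@cake()" to the last "@end".
    echo $m['func'], "\n";     // name of the outermost function
    echo $m['content'], "\n";  // everything between the outer "@cake()" and its "@end"
}
```

Only the outermost block is reported here, because the captures made inside a recursion are discarded when the recursion returns.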
Note that you can capture the content of every nested function, not just the outermost one, if you put the whole pattern inside a lookahead (?=...).
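A minimal sketch of the lookahead idea, assuming preg_match_all is used to collect the overlapping matches (the $input string and variable names are illustrative, not part of the original pattern):

```php
<?php
// Same pattern, wrapped in a lookahead so that matches are allowed to overlap.
// The lookahead does not capture, so (?-4) still points at group 1.
$pattern = '~(?=(@(?<func>\w++)\((?<param>[^)]*+)\)(?<content>(?>[^@]++|(?-4))*)@end))~';

$input = "@cake()\n@cake()\n@fish()\n@end\n@end\n@end";

preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);

foreach ($matches as $m) {
    // $m[0] is empty because a lookahead consumes nothing; group 1 holds the block.
    printf("%s: %d bytes\n", $m['func'], strlen($m[1]));
}
```

Each nesting level shows up as its own match, since the engine only advances one character after every zero-length match.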
pattern details:
~ # pattern delimiter
( # open the first capturing group
@(\w++) # function name in the second capturing group
\( # literal (
([^)]*+) # param in the third capturing group
\) # literal )
( # open the fourth capturing group
(?> # open an atomic group
[^@]++ # all characters but @ one or more times
| # OR
(?-4) # recurse into the first capturing group (the fourth group to the left, counting from the current position)
)* # close the atomic group, repeat zero or more times
) # close the fourth capturing group
@end # literal @end
)~ # close the first capturing group, end delimiter
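If you want to keep this kind of commented layout in your code, one option is the x modifier, which makes PCRE ignore whitespace and # comments inside the pattern. A sketch, using the template string from the question as test input (the comments are reworded so that they contain no delimiter or quote characters):

```php
<?php
// Free-spacing version of the unnamed-capture pattern; note the trailing "x".
$pattern = '~
    (                   # group 1: the whole block
        @(\w++)         # group 2: function name
        \( ([^)]*+) \)  # group 3: parameters between literal parentheses
        (               # group 4: content
            (?>
                [^@]++  # any run of characters other than @
              |
                (?-4)   # recurse into group 1 for a nested block
            )*
        )
        @end
    )
~x';

$input = "@foreach(\$people as \$person)\n<p></p>\n@end";

if (preg_match($pattern, $input, $m)) {
    echo $m[2], "\n"; // function name
    echo $m[3], "\n"; // parameters
    echo $m[4], "\n"; // content between the opening tag and @end
}
```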
Upvotes: 2
Reputation: 129001
You have nesting, which takes you out of the realm of regular grammars, so a regular expression in the strict, formal sense can't match it. Some regular expression engines (including PCRE, which PHP uses) have extensions such as recursion that let you recognize some nested expressions, but that will only take you so far. Look into traditional parsing tools, which should be able to handle your workload. This question goes into some of them.
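To give an idea of what the parsing approach looks like, here is a small hand-written sketch rather than one of those tools. The function name parseBlocks and the array layout are hypothetical, and it assumes every @name(...) opens a block and every @end closes one:

```php
<?php
/**
 * Hypothetical helper: turns "@name(args) ... @end" blocks into a nested array.
 * Plain text between directives is kept as string children.
 */
function parseBlocks(string $src): array
{
    // Split into text and directive tokens; the capture group keeps the directives.
    $tokens = preg_split(
        '~(@\w+\([^)]*\)|@end\b)~',
        $src,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    // The stack holds the blocks that are currently open; index 0 is a dummy root.
    $stack = [['name' => 'root', 'args' => '', 'children' => []]];

    foreach ($tokens as $tok) {
        if ($tok === '@end') {
            if (count($stack) < 2) {
                throw new RuntimeException('Unexpected @end');
            }
            // Close the innermost block and attach it to its parent.
            $node = array_pop($stack);
            $stack[count($stack) - 1]['children'][] = $node;
        } elseif (preg_match('~^@(\w+)\(([^)]*)\)$~', $tok, $m)) {
            // Open a new block.
            $stack[] = ['name' => $m[1], 'args' => $m[2], 'children' => []];
        } else {
            // Plain text belongs to the innermost open block.
            $stack[count($stack) - 1]['children'][] = $tok;
        }
    }

    if (count($stack) !== 1) {
        throw new RuntimeException('Missing @end');
    }

    return $stack[0]['children'];
}

$tree = parseBlocks("@cake()\n@cake()\n@fish()\n@end\n@end\n@end");
echo $tree[0]['name'], "\n"; // outermost block
```

Unlike a single regex, this reports unbalanced input explicitly and gives you a tree you can walk when rendering the template.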
Upvotes: 0