Steve Gattuso

Reputation: 7822

Template ifs using regex

Hey everyone, I'm working on a PHP application that needs to parse a .tpl file containing HTML, and I'm making it so that the HTML can have variables and basic if statements in it. An if statement looks something like this:

<!--if({VERSION} == 2)-->
Hello World
<!--endif -->

To parse that, I've tried using preg_replace with no luck. The pattern that I tried was

/<!--if\(([^\]*)\)-->([^<]*)<!--endif-->/e

which gets replaced with

if($1) { echo "$2"; }

Any ideas as to why this won't work and what I can do to get it up and running?

Upvotes: 0

Views: 496

Answers (3)

Bill Karwin

Reputation: 562368

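Testing your regular expression, I see that your backslash is applied to the closing square bracket, so the bracket is escaped and the character class does not end where you intended. To put a literal backslash inside square brackets inside a quoted PHP string, you need to escape it twice, once for the PHP string literal and once for the regex engine: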

'/<!--if\(([^\\\]*)\)-->([^<]*)<!--endif-->/e'

But I don't know why you're inventing a new template logic framework, when solutions like Smarty and PHP itself exist.
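On the escaping point, here is a minimal sketch of what PHP's single-quoted string parsing does to the pattern before the regex engine sees it. The variable names are only for illustration, and the /e modifier is left out because preg_match does not need it.

```php
<?php
// In a single-quoted string, \\ collapses to one backslash and \] is kept as-is.
$asIntended = '[^\\\]*'; // three backslashes in the source
$asWritten  = '[^\]*';   // one backslash in the source

echo $asIntended, "\n"; // prints [^\\]*  (class: anything but a backslash)
echo $asWritten, "\n";  // prints [^\]*   (the backslash escapes the bracket)

// The opening tag from the question, matched with the doubly escaped class.
$tag = '<!--if({VERSION} == 2)-->';
$matched = preg_match('/<!--if\(([^\\\]*)\)-->/', $tag, $m);

echo $matched, "\n"; // prints 1
echo $m[1], "\n";    // prints {VERSION} == 2
```

With the doubly escaped version the class closes where you expect, and the first capture group holds the condition text.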


Here's test code, in response to the comments below.

testinput.tpl:

<!--if({VERSION} == 2)-->
Hello World
<!--endif-->

match.php:

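<?php
// Load the sample template and check each piece of the pattern separately.
// The /e modifier is dropped here: it only affected preg_replace() and was removed in PHP 7.
$template = file_get_contents('testinput.tpl');
print preg_match('/<!--if\(([^\\\]*)\)-->/', $template) . "\n";                    // opening tag
print preg_match('/<!--endif-->/', $template) . "\n";                              // closing tag
print preg_match('/<!--if\(([^\\\]*)\)-->([^<]*)<!--endif-->/', $template) . "\n"; // whole block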

test run:

$ php match.php
1
1
1

Upvotes: 2

Alan Moore

Reputation: 75232

I think you meant to do this:

'/<!--if\(([^)]*)\)-->([^<]*)<!--endif-->/'

Your regex has only one character class in it:

[^\]*)\)-->([^<]

Here's what's happening:

  • The first closing square bracket is escaped by the backslash, so it's matched literally.
  • The parentheses that were supposed to close the first capturing group and open the second one are also taken literally; it isn't necessary to escape parens inside a character class.
  • The first hyphen is taken as a metacharacter; it forms the range [)*+,-].
  • The second opening square bracket is taken as a literal square bracket because it's inside a character class.
  • The second caret is taken as a literal caret because it's not the first character in the class.

So, after removing the duplicates and sorting the characters into their ASCII order, your character class is equivalent to this:

[^()*+,\-<>\[\]^]

And the parentheses outside the character class are still balanced, so the regex compiles, but it doesn't even come close to matching what you wanted it to.
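If you want to check this yourself, here is a rough sketch. It compares the accidental character class with the simplified one over every ASCII character, then runs both full patterns against the sample template. The variable names are made up for the example, and the /e modifier is left off since preg_match has no use for it.

```php
<?php
// The accidental class from the question and the simplified equivalent,
// each anchored so it must match exactly one character.
$accidental = '/^[^\]*)\)-->([^<]\z/';
$simplified = '/^[^()*+,\-<>\[\]^]\z/';

// Count the ASCII characters on which the two classes disagree.
$differences = 0;
for ($c = 0; $c < 128; $c++) {
    if (preg_match($accidental, chr($c)) !== preg_match($simplified, chr($c))) {
        $differences++;
    }
}
echo $differences, "\n"; // prints 0

$template = "<!--if({VERSION} == 2)-->\nHello World\n<!--endif-->";

// Pattern as posted in the question, and the corrected pattern.
$original  = '/<!--if\(([^\]*)\)-->([^<]*)<!--endif-->/';
$corrected = '/<!--if\(([^)]*)\)-->([^<]*)<!--endif-->/';

$originalResult  = preg_match($original, $template);
$correctedResult = preg_match($corrected, $template, $m);

echo $originalResult, "\n";  // prints 0: compiles, but does not match
echo $correctedResult, "\n"; // prints 1
echo $m[1], "\n";            // prints {VERSION} == 2
```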

Upvotes: 0

Greg

Reputation: 321678

You have a space between endif and --> but your regular expression doesn't allow this.

Incidentally, this seems horribly insecure... Is there any reason you're not using a pre-built templating engine like Smarty?
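If you do stick with your own syntax, here is a rough sketch of one way to address both points: \s* tolerates the optional whitespace, and preg_replace_callback compares values directly instead of evaluating template text as PHP (the /e modifier does exactly that, and it was removed in PHP 7.0). The function name renderIfBlocks and the $vars array are made up for this example, it assumes PHP 7.1 or later, and it only understands conditions of the form {NAME} == value.

```php
<?php
// Hypothetical helper: keeps or drops <!--if({NAME} == value)--> ... <!--endif --> blocks.
function renderIfBlocks(string $template, array $vars): string
{
    // \s* allows optional whitespace, including the space in "<!--endif -->".
    // The /s flag lets . match newlines so the body can span several lines.
    $pattern = '/<!--\s*if\(\s*\{(\w+)\}\s*==\s*([^)]*?)\s*\)\s*-->(.*?)<!--\s*endif\s*-->/s';

    $result = preg_replace_callback($pattern, function (array $m) use ($vars): string {
        [, $name, $expected, $body] = $m;
        // Compare as strings; nothing from the template is ever executed.
        $actual = isset($vars[$name]) ? (string) $vars[$name] : null;
        return $actual === $expected ? $body : '';
    }, $template);

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $template;
}

$template = "<!--if({VERSION} == 2)-->\nHello World\n<!--endif -->";

$shown  = renderIfBlocks($template, ['VERSION' => 2]); // body is kept
$hidden = renderIfBlocks($template, ['VERSION' => 3]); // body is dropped

var_dump($shown === "\nHello World\n"); // bool(true)
var_dump($hidden === '');               // bool(true)
```

Anything beyond simple equality (nested ifs, else branches, other operators) would need a real parser, which is where an existing engine starts to pay off.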

Upvotes: 3
