Reputation: 22337
I want to remove all PHP tags from an external text so that it can be included in PHP safely.
This is the sample input:
<?
?>
<html>
<?php ?>
<?= ?>
</html>
<?
or any other possibilities
And the expected output:
<html>
</html>
Note that the last PHP open tag may not have an end tag!
Upvotes: 0
Views: 1522
Reputation: 13724
The proper way to do this is to not include it, but instead load it as a string, using file_get_contents(). That will preserve the PHP tags without executing them. However, the following regex will do exactly what you asked for:
#<\?.*?(\?>|$)#s
Here's a breakdown of what that string represents:
#         A delimiter marking the beginning and end of the expression - nearly anything will do (preferably something not in the regex itself).
<\?       Find the text "<?", which is the beginning of a PHP tag. Note that a backslash before the question mark is needed because question marks normally do something special in regular expressions.
.*?       Include as much text as necessary (".*"), but as little as possible ("?").
(\?>|$)   Stop at an ending PHP tag ("?>"), OR the end of the text ("$"). This doesn't necessarily have to stop at the first one, but since the previous part is "as little as possible", it will.
#         The same delimiter, marking the end of the expression.
s         A special flag, indicating that the pattern can span multiple lines. Without it, the regex would expect to find the entire PHP tag (beginning and end) on a single line.
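As a rough illustration, the pattern above could be applied with preg_replace() as in the sketch below. The helper name strip_php_tags() and the inline sample string are assumptions made for this example, not part of the original answer. Note that the newlines that surrounded the removed tags stay in the result, so the output may contain blank lines.

```php
<?php
// Hypothetical helper (name chosen for this sketch only).
function strip_php_tags(string $text): string
{
    // Delete everything from an opening tag up to the nearest closing tag,
    // or up to the end of the input when the last tag is never closed.
    return preg_replace('#<\?.*?(\?>|$)#s', '', $text);
}

// Sample input modelled on the question, including an unclosed final tag.
$input = "<?\n?>\n<html>\n<?php ?>\n<?= ?>\n</html>\n<?\n";
$output = strip_php_tags($input);

echo $output;
```

Because this is plain text matching, it does not depend on how PHP itself is configured, but it will also cut any "<?" that appears in ordinary content, such as an XML declaration.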
Upvotes: 2
Reputation:
I don't think there is a great way to do exactly what you want, but if it's acceptable to send the PHP tags (unparsed) in the output you can just use:
<?php echo file_get_contents('input.html'); ?>
Otherwise, maybe have a look at the token_get_all function:
http://www.php.net/manual/en/function.token-get-all.php
You could iterate over all results and only return those of type T_INLINE_HTML:
$toks = token_get_all( file_get_contents( 'input.html' ) );
foreach( $toks as $tok ) {
    // Tokens are either [id, text, line] arrays or single-character strings.
    if( is_array( $tok ) && $tok[0] === T_INLINE_HTML ) {
        print $tok[1];
    }
}
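If the result is needed as a string rather than printed directly, the same loop could be wrapped in a small function, roughly as sketched below. The function name keep_inline_html() and the sample string are assumptions for illustration only. One caveat: the tokenizer follows PHP's own configuration, so a bare "<?" is only treated as an opening tag when short_open_tag is enabled; otherwise it is passed through as inline HTML.

```php
<?php
// Hypothetical helper (name chosen for this sketch only).
function keep_inline_html(string $source): string
{
    $html = '';
    foreach (token_get_all($source) as $tok) {
        // Array tokens have the form [id, text, line]; keep only the
        // text that lies outside of any PHP block.
        if (is_array($tok) && $tok[0] === T_INLINE_HTML) {
            $html .= $tok[1];
        }
    }
    return $html;
}

// Sample input with a closed block, an echo tag, and an unclosed final block.
$sample = "<html><?php echo 'x'; ?><body><?= 1 ?></body></html><?php echo 2;";
$clean = keep_inline_html($sample);

echo $clean;
```

Unlike a regex, this relies on PHP's own lexer, so a closing-tag sequence inside a PHP string literal does not end the block early.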
Upvotes: 3