Reputation: 355
I want to remove all single-line comments (e.g. //comments) from my code using a regular expression.
So far I'm using: preg_replace('/\/\/(.*)/','',$html);
but it also removes strings like http://example.com.
Upvotes: 4
Views: 8791
Reputation: 14135
preg_replace('/(?<!:)\/\/.*/','',$html);
You could try something like this, but I'm not sure you can safely use regular expressions to account for all possible edge cases.
However, as mentioned above, using a tokenizer is a better and more reliable method of doing this. In fact, there is an example of how to remove comments from a PHP file in the user comments on the php.net manual pages, see here. This could serve as a good starting point, but I recommend testing it yourself. Code in the comments on the php.net manual pages can often be a bit dodgy.
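To make the edge-case concern concrete, here is a minimal sketch. It assumes a pattern with a negative lookbehind for a colon, which is only one possible tweak and not necessarily what the answer had in mind, and the sample lines are invented for illustration. The URL survives, but a string that merely contains // is still cut in half.

```php
<?php
// Assumption: skip any "//" that directly follows a colon (as in "http://").
$pattern = '/(?<!:)\/\/.*/';

// Hypothetical sample lines, one per case.
$lines = [
    'echo 1; // comment',            // real comment: should be removed
    '$url = "http://example.com";',  // URL: should be left alone
    '$path = "a//b";',               // "//" inside a string: gets mangled
];

$results = [];
foreach ($lines as $line) {
    $results[] = preg_replace($pattern, '', $line);
}

foreach ($results as $result) {
    echo $result, "\n";
}
```

The third line comes out as `$path = "a`, which is no longer valid PHP. This is the kind of case a regex cannot tell apart from a real comment.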
Upvotes: 1
Reputation: 63
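function stripPhpComments($code)
{
    $strippedCode = '';
    // token_get_all() returns arrays [id, text, line] for named tokens
    // and plain strings for single-character tokens such as ";" or "{".
    foreach (token_get_all($code) as $token) {
        if (!is_array($token)) {
            $strippedCode .= $token;
        } elseif ($token[0] !== T_COMMENT) {
            // T_COMMENT covers //, # and /* */ comments; docblocks are T_DOC_COMMENT and are kept.
            $strippedCode .= $token[1];
        }
    }
    return $strippedCode;
}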
Upvotes: 0
Reputation: 5715
If you don't find a better alternative, might I suggest the following, although performance-wise it's not the best approach.
$lines = explode("\n", $source);
$lines = array_map(
    function ($line) {
        // Drop everything from the first "//" (and any whitespace before it) to the end of the line.
        return preg_replace("@\s*//.*$@", '', $line);
    },
    $lines
);
$source = implode("\n", $lines);
Upvotes: 1
Reputation: 61
If you want to minify your PHP code, why not use php_strip_whitespace()?
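A minimal sketch of that suggestion follows. Note that php_strip_whitespace() takes a file name rather than a source string, so the example writes a hypothetical snippet to a temporary file first. It also removes block comments and collapses whitespace, which goes further than stripping only // comments.

```php
<?php
// Hypothetical sample source, written to a temp file because
// php_strip_whitespace() expects a path, not a code string.
$path = tempnam(sys_get_temp_dir(), 'strip');
file_put_contents(
    $path,
    "<?php\n// a comment\n\$url = 'http://example.com'; // trailing\necho \$url;\n"
);

$stripped = php_strip_whitespace($path);
unlink($path);

echo $stripped, "\n";
```

The URL inside the string literal is preserved because the function works on PHP's own view of the source rather than on raw text.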
Upvotes: 0
Reputation: 145482
You cannot do this reliably. There is no guarantee that // at any position in a file indicates a comment in PHP context. It might very well be contained in a string, for example.
It's only possible to approach this with a few concessions. For example, if it is sufficient to catch // comments that occupy a line of their own, then this would be an option with fewer false positives:
$source = preg_replace('#^\s*//.+$#m', "", $source);
The real solution would be to utilize a language parser, but that's obviously overkill. So try adding some heuristics to avoid removing the wrong occurrences.
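As a rough illustration of that concession, the sketch below runs the line-anchored pattern on an invented snippet. Whole-line comments are removed and the URL is untouched, but a trailing comment after code is left in place. A line inside a multi-line string that happens to start with // would still be a false positive.

```php
<?php
// Hypothetical sample source for demonstration only.
$source = <<<'PHP'
<?php
// whole-line comment
$url = 'http://example.com'; // trailing comment
    // indented comment
echo $url;
PHP;

// Only matches "//" that is preceded by nothing but whitespace on its line.
$result = preg_replace('#^\s*//.+$#m', '', $source);

echo $result, "\n";
```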
Upvotes: 3
Reputation: 546045
Perhaps a better method would be to use the PHP engine itself, for example via token_get_all(). That function will tokenise a PHP script, so you can view it exactly as PHP views it, and hence remove or replace comments.
Doing this with a regex alone would be at best a nightmare, and most likely not possible at all.
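A minimal sketch of the tokenizer approach, restricted to // comments as the question asks. The function name stripLineComments and the sample input are made up for illustration. It assumes the input is a complete PHP source string including the opening tag, since token_get_all() treats anything outside PHP tags as inline HTML.

```php
<?php
// Hypothetical helper: removes only "//" comments, keeps everything else.
function stripLineComments(string $code): string
{
    $out = '';
    foreach (token_get_all($code) as $token) {
        // Single-character tokens such as ";" arrive as plain strings.
        if (!is_array($token)) {
            $out .= $token;
            continue;
        }

        [$id, $text] = $token;

        if ($id === T_COMMENT && strncmp($text, '//', 2) === 0) {
            // Before PHP 8.0 the comment token includes the trailing newline;
            // re-emit any trailing line break so line numbers stay stable.
            $out .= substr($text, strlen(rtrim($text, "\r\n")));
            continue;
        }

        $out .= $text;
    }
    return $out;
}

// Hypothetical sample input.
$code = <<<'PHP'
<?php
$url = 'http://example.com'; // trailing comment
// whole-line comment
echo $url;
PHP;

$stripped = stripLineComments($code);
echo $stripped, "\n";
```

Because the string literal is a single token, the // inside the URL is never considered a comment, which is the case the regex attempts cannot handle in general.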
Upvotes: 8