Reputation: 1135
The code below works perfectly on XAMPP on my PC, but it does not work on my newly bought VPS; it crashes my script.
preg_match_all( "/$regex/siU" , $string , $matches , PREG_SET_ORDER );
This is expected to simply fetch links and titles from HTML.
Earlier today I ran into a similar regex problem. The code ran fine on my local server but produced a "Connection Was Reset" error on the VPS. That problem was caused by some commented-out HTML (with PHP code inside it), which the code below was supposed to remove to optimize the output. Even though the connection reset problem is now resolved, the HTML still has comments in the browser source.
$string = preg_replace( '/<!--(.|\s)*?-->/' , '' , $string );
So the problem is clear: these regex functions are not working properly, but I do not know the solution.
Can anyone help me solve this?
Solved:
Thanks to https://stackoverflow.com/a/12761686/369005 @vimishor
Upvotes: 4
Views: 1320
Reputation: 75242
So the root problem is that the code that's supposed to remove HTML comments isn't working? That's probably because the regex that's supposed to match the comments uses (.|\s)* to work around the fact that . doesn't match newlines. That's almost guaranteed to cause problems, as this answer explains.
The correct way to match anything-including-newlines is to use the s modifier. For example:
'/<!--.*?-->/s'
That turns on single-line mode (also known as DOTALL mode), which allows the . to match newlines. (The author of that other question had to use [\S\s] instead, because JavaScript has no equivalent for single-line/DOTALL mode.)
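To make the effect of the s modifier concrete, here is a minimal sketch. The sample input in $html is made up for illustration and is not the asker's data.

```php
<?php
// Hypothetical sample input: a comment that spans several lines.
$html = "<p>before</p>\n<!-- first line\nsecond line -->\n<p>after</p>";

// Without /s the dot stops at newlines, so the multi-line comment never matches.
$withoutS = preg_replace('/<!--.*?-->/', '', $html);

// With /s (DOTALL) the dot also matches newlines and the comment is removed.
$withS = preg_replace('/<!--.*?-->/s', '', $html);

var_dump($withoutS === $html);               // true when the comment survived
var_dump(strpos($withS, '<!--') === false);  // true when the comment is gone
```

The pattern stays a plain lazy .*? with no alternation inside the repetition, which avoids the (.|\s)* construct from the question.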
Upvotes: 1
Reputation: 91762
It seems the problem is you are misunderstanding what html comments do. According to your comment below your question, the problem is that html comments were not removed, causing php to run with the wrong parameters.
However, html comments have no influence on php code that is or is not run, only on what the browser displays (and runs in case of javascript). Your php code is run before the output gets to the browser.
If you want to comment PHP code out, you will need to put it in a /* */ block or start each line with //.
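A small sketch of the difference, using output buffering only so the result is easy to inspect. The strings and the $output variable are illustrative, not taken from the question.

```php
<?php
// Capture the template output so it can be examined afterwards.
ob_start();
?>
<!-- <?php echo 'executed anyway'; ?> -->
<?php /* echo 'never executed'; */ ?>
<?php
$output = ob_get_clean();

// The HTML comment markers are plain text to PHP, so the first echo still ran;
// only the statement inside the PHP comment was skipped.
echo $output;
```

The first line of the template ends up in the page source as an HTML comment containing the echoed text, which shows that the server ran the code before the browser ever saw the comment.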
Upvotes: 0
Reputation: 173642
Let me stop you there for a second. Parsing HTML with regular expressions is a bad idea, unless it's a very isolated issue on a malformed document. You will want to use a proper parser; for instance, here's an example that strips HTML comments:
$html = <<<EOM
<html>
<body>
<div id="test">
<!--
comment here
-->
</div>
</body>
</html>
EOM;
$d = new DOMDocument;
$d->loadHTML($html);
$x = new DOMXPath($d);
foreach ($x->query('//comment()') as $node) {
    $node->parentNode->removeChild($node);
}
echo $d->saveHTML();
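The same parser-based approach could plausibly cover the original goal of fetching links and titles. The question does not show $regex or the real HTML, so the input below is hypothetical, and "title" is assumed here to mean the anchor text.

```php
<?php
// Hypothetical input; the asker's actual markup is not shown in the question.
$html = '<p><a href="/one">First</a> and <a href="/two">Second</a></p>';

// Collect libxml warnings internally instead of printing them,
// since real-world markup is rarely perfectly valid.
libxml_use_internal_errors(true);
$doc = new DOMDocument;
$doc->loadHTML($html);
libxml_clear_errors();

$links = [];
foreach ($doc->getElementsByTagName('a') as $a) {
    $links[] = [
        'href' => $a->getAttribute('href'),
        'text' => trim($a->textContent),
    ];
}

print_r($links);
```

If the title attribute is what is needed instead, $a->getAttribute('title') can be read the same way.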
Upvotes: 1
Reputation: 5726
It is a known fact that PCRE sometimes has problems with text longer than about 200 lines. Developers from Drupal and GeSHi were hit by this problem in the past.
References:
If you can split the text into small chunks (100 lines, for example) and run the regex on each chunk, that may help.
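Before splitting the input, it may be worth checking whether PCRE is actually giving up on the VPS. The sketch below assumes the failure is related to PCRE's configured limits, which the question does not confirm; stripHtmlComments is a hypothetical helper name, not something from the original code.

```php
<?php
// Hypothetical helper: strips HTML comments and reports a PCRE failure
// instead of silently continuing with a null result.
function stripHtmlComments(string $html): string
{
    $result = preg_replace('/<!--.*?-->/s', '', $html);
    if ($result === null) {
        // preg_last_error() returns one of the PREG_*_ERROR constants,
        // for example PREG_BACKTRACK_LIMIT_ERROR.
        throw new RuntimeException('PCRE error code: ' . preg_last_error());
    }
    return $result;
}

// These settings can differ between machines, which is one possible reason
// for the same code behaving differently on XAMPP and on a VPS.
echo 'pcre.backtrack_limit = ', ini_get('pcre.backtrack_limit'), PHP_EOL;
echo 'pcre.recursion_limit = ', ini_get('pcre.recursion_limit'), PHP_EOL;

echo stripHtmlComments("a<!-- x\ny -->b"), PHP_EOL;
```

Comparing the two ini values on both machines is a cheap first step. Note that splitting the text into fixed-size chunks can cut a comment in half, so a chunk boundary would need to avoid open comments.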
Upvotes: 2
Reputation: 97
Try this:
$string = preg_replace( '/.*<!--(.|\s)*?-->.*/' , '' , $string );
Some regex implementations will execute your regular expression like this: /^<!--(.|\s)*?-->$/. So your expression may behave differently on different servers.
Upvotes: -1