Peavey

Reputation: 302

PHP: regex to find comments starting with double slashes //

I need a regex that strips single-line comments from a string but leaves URLs untouched. The code should work with something like this:

//Some Comment on http://bobobo.com where bla < 5
<script type="text/javascript" src="http://bububu.com"></script>
<script type='text/javascript' src='http://bababa.com'></script>

EDIT: Of course I do not use that kind of comment directly in the HTML file. A correct example would be:

<script type="text/javascript">
   //Some Comment on http://bobobo.com where bla < 5
</script>
<script type="text/javascript" src="http://bububu.com"></script>
<script type='text/javascript' src='http://bababa.com'></script>

My bad, sorry for the misleading example.

A possible solution should find "//Some Comment on http://bobobo.com where bla < 5", but not "//bububu.com">" and "//bababa.com'>".

Thanks for any hint...

Upvotes: 1

Views: 2361

Answers (5)

Peavey

Reputation: 302

Thanks everyone, but finally

preg_match('!//.*?\n!', $data, $matches); 

seems to do the trick with or without spaces, tabs or new lines before the comment.
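For finding (rather than stripping) such comments across a whole string, a minimal sketch could anchor the pattern to the start of each line, so that the // inside a src attribute is never a candidate. The function name findLineComments is hypothetical, and the sketch assumes that comments always sit on a line of their own, as in the example from the question.

```php
<?php
// Hypothetical helper: returns every comment that occupies a whole line.
function findLineComments(string $source): array
{
    // ^ and $ match at line boundaries because of the m modifier;
    // \h* allows spaces or tabs before the comment.
    preg_match_all('~^\h*(//.*)$~m', $source, $matches);

    // rtrim() drops a trailing "\r" left over from Windows line endings.
    return array_map('rtrim', $matches[1]);
}

$data = <<<'HTML'
<script type="text/javascript">
   //Some Comment on http://bobobo.com where bla < 5
</script>
<script type="text/javascript" src="http://bububu.com"></script>
HTML;

print_r(findLineComments($data));
```

Unlike preg_match(), preg_match_all() keeps going after the first hit, so several comments in the same string are all reported.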

Upvotes: 1

Berry Langerak

Reputation: 18859

preg_replace( '~^\h*//.*$~m', '', $html );

This replaces everything from // to the end of the line with '', allowing optional horizontal whitespace before the slashes. Not tested, but something like that should work.
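As a rough illustration of that idea, here is a small sketch run against the example from the question. The helper name stripLineComments is made up for this example, and the trailing \R? (which also swallows the line break so no empty line stays behind) is an addition that the answer does not mention.

```php
<?php
// Hypothetical helper: removes lines that consist only of a // comment.
function stripLineComments(string $source): string
{
    // ^\h*  optional horizontal whitespace at the start of a line (m modifier)
    // //.*  the comment itself, up to the end of the line
    // \R?   the line break, if there is one
    return preg_replace('~^\h*//.*\R?~m', '', $source);
}

$html = <<<'HTML'
<script type="text/javascript">
   //Some Comment on http://bobobo.com where bla < 5
</script>
<script type="text/javascript" src="http://bububu.com"></script>
HTML;

echo stripLineComments($html);
```

Because the pattern is anchored to the start of a line, the // inside the src attributes cannot match.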

Upvotes: 0

Levi Morrison

Reputation: 19552

The short answer is: don't. The reason is that single-line comments are not valid comments in HTML. They're just text tokens. You shouldn't have them in your code. Eliminate them before they are inserted into your source.


I tried to give you an alternative answer using PHP's DOMDocument and DOMXPath, but DOMXPath only supports XPath 1.0, and the replace function doesn't exist until XPath 2.0. I'm not familiar enough with XPath 1.0 to be able to replace a string in the DOM. Here's what you would need to do though:

  1. Select all the text nodes (will ignore attributes because they aren't text nodes)
  2. Replace \s*//.* (dot does not match a newline) with ''.
  3. Insert the text back into the node.
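The steps above can be sketched by using XPath 1.0 only for the selection and doing the replacement in PHP. This is an untested-in-production sketch under a few assumptions: the function name stripScriptLineComments is hypothetical, only inline script elements are visited (a narrower choice than "all text nodes", so that URLs in ordinary page text are left alone), and the pattern is anchored to the start of a line.

```php
<?php
// Hypothetical helper: strips whole-line // comments from inline scripts.
function stripScriptLineComments(string $html): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // collect parser warnings silently
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);

    // 1. Select inline scripts. src="..." is an attribute, not a text
    //    node, so URLs inside attributes are never touched.
    foreach ($xpath->query('//script[not(@src)]') as $script) {
        foreach ($script->childNodes as $node) {
            // DOMCdataSection extends DOMText, so both kinds are covered.
            if ($node instanceof DOMText) {
                // 2. Remove the comment lines, 3. write the text back.
                $node->data = preg_replace('~^\h*//.*\R?~m', '', $node->data);
            }
        }
    }

    return $doc->saveHTML();
}
```

Note that loadHTML() normalises the markup (it adds a doctype and html wrapper elements), so the returned string is not byte-for-byte the input minus the comments.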

Upvotes: 1

Cfreak

Reputation: 19309

You could also use this to match comments that don't appear on a line by themselves. It uses a negative lookbehind, so the // is only matched when it is not preceded by http:

/(?<!http:)\/\//
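A small sketch of how such a lookbehind could be used for stripping. The helper name stripTrailingComments is hypothetical, and the lookbehind here is loosened to (?<!:) so that https:// and similar schemes are skipped as well; that widening is an assumption, not part of the answer.

```php
<?php
// Hypothetical helper: strips // comments anywhere on a line unless the
// slashes directly follow a colon (as in http:// or https://).
function stripTrailingComments(string $source): string
{
    // \h*     whitespace between the code and the comment
    // (?<!:)  the character before // must not be a colon
    // //.*    the comment, up to the end of the line
    return preg_replace('~\h*(?<!:)//.*~', '', $source);
}

$js = "var a = 1; // see http://bobobo.com\n"
    . "<script src=\"http://bububu.com\"></script>";

echo stripTrailingComments($js);
```

A URL inside the comment is removed along with it, since .* consumes the rest of the line. Protocol-relative URLs such as src="//example.com/x.js" would still be mistaken for comments, so this only fits input where those do not occur.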

Upvotes: 0

Tomalak

Reputation: 338278

The regex is ^//.

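In preg_replace(), you would use the string '!^//!m', for example. The ! is used as a regex delimiter to avoid leaning toothpick syndrome ('/^\/\//'), and the m modifier makes ^ match at the start of every line rather than only at the start of the string.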

If your lines can start with spaces, you could use ^\s*//.

Upvotes: 0
