Gavin Miller

Reputation: 43825

Regular expression to remove JavaScript double slash (//) style comments

I'm trying to remove JavaScript comments via a regular expression in C# and have become stuck. I want to remove every occurrence of a double slash (//) style comment.

My current regex is (?<!:)//[^\r\n]*, which catches all comments and avoids matching http://. However, the negative lookbehind was a lazy shortcut and, of course, it came back to bite me in the following test case:

var XSLPath = "//" + Node;

So I'm looking for a regular expression that will perform a lookbehind to see if an even number of double quotes (") occurs before the match. I'm not sure if this is possible. Or is there maybe a better way to do this?

Upvotes: 4

Views: 3543

Answers (1)

Steve Wortham

Reputation: 22230

(Updated based on comments)

It looks like this works pretty well:

(?<=".*".*)//.*$|(?<!".*)//.*$

In my Regex Hero tests it matches comments (almost) the way I think it should.

For instance, it'll completely ignore this line:

var XSLPath = "//" + Node;

But it's smart enough to match the comment at the end of this line:

var XSLPath = "//"; // stuff to remove
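
To try the two lines above outside Regex Hero, here is a minimal C# sketch. It assumes .NET's Regex class, which accepts the unbounded lookbehinds this pattern relies on (many other engines reject them). The variable names are illustrative and not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

// The pattern from the answer, copied verbatim into a C# verbatim string
// (each "" inside @"..." is one literal double quote).
var answerPattern = new Regex(@"(?<="".*"".*)//.*$|(?<!"".*)//.*$");

// Illustrative inputs: the two sample lines discussed above.
string[] samples =
{
    "var XSLPath = \"//\" + Node;",
    "var XSLPath = \"//\"; // stuff to remove",
};

foreach (var line in samples)
{
    // The first line is printed unchanged; the second loses its trailing
    // comment (the space before the comment is kept).
    Console.WriteLine(answerPattern.Replace(line, ""));
}
```

Without RegexOptions.Multiline, $ only anchors at the end of the whole input, so this sketch should be applied one line at a time.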

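However, it's not smart enough to deal with three or more quotation marks before the //. The first lookbehind only checks that at least two quotes precede the match, so a // inside a second string literal on the same line (for example var a = "x" + "//";) is still treated as a comment. I'm not entirely sure how to solve that without hard-coding it; you need some way to require an even number of quotes.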
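
One way to sidestep quote counting, offered as a sketch and not as part of the original answer, is to match string literals and comments in a single alternation and delete only the comment matches. Because the string branches are tried first at each position, a // inside quotes is consumed as part of the string and never reaches the comment branch. The helper name StripLineComments and the group name comment are hypothetical. The sketch does not handle /* */ block comments, JavaScript regex literals, or template literals.

```csharp
using System;
using System.Text.RegularExpressions;

// Three alternatives, tried in this order at each position in the input.
var tokenPattern = new Regex(
    @"""(?:\\.|[^""\\\r\n])*""" +   // double-quoted string, allowing \" escapes
    @"|'(?:\\.|[^'\\\r\n])*'" +     // single-quoted string, allowing \' escapes
    @"|(?<comment>//[^\r\n]*)");    // line comment, up to the end of the line

// Hypothetical helper: string matches are put back unchanged,
// comment matches are replaced with the empty string.
string StripLineComments(string source) =>
    tokenPattern.Replace(source, m => m.Groups["comment"].Success ? "" : m.Value);

// Three quotes precede the "//" inside the second string; it is kept,
// and only the trailing comment is removed.
Console.WriteLine(StripLineComments("var a = \"x\" + \"//\"; // note"));

// A URL inside a string is left alone without any special lookbehind.
Console.WriteLine(StripLineComments("var url = 'http://example.com';"));
```

Since the comment branch stops at \r or \n, the same call can be applied to a whole multi-line source string without RegexOptions.Multiline.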

Upvotes: 3
