CRISHK Corporation

Reputation: 3008

Regular expression to remove JavaScript comments of type //

I'm using the following regular expression:

$output = preg_replace( "/\/\/(.*)\\n/", "", $output );

The code works well, but when the input contains a URL such as http://this_is_not_a_comment.com/kickme, the pattern also matches the // after http: and removes the rest of the line.

How can I avoid replacing those URLs?

Thanks,

Upvotes: 1

Views: 4834

Answers (2)

CRISHK Corporation

Reputation: 3008

$output = preg_replace( "/(?<!\:)\/\/(.*)\\n/", "", $output );
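A minimal sketch of how this lookbehind behaves, assuming a small hypothetical input (the three JavaScript lines below are made up for illustration). The (?<!:) part rejects any // that directly follows a colon, so http:// survives, but a // inside a string that is not preceded by a colon is still cut, and the trailing newline is removed together with the comment.

```php
<?php
// Hypothetical input: a real comment, a URL in a string, and a string containing "//".
$output = "var a = 1; // set a\n"
        . "var url = \"http://this_is_not_a_comment.com/kickme\";\n"
        . "var s = \"a//b\";\n";

// Same pattern as above, written in single quotes so PHP passes the backslashes to PCRE as-is.
$cleaned = preg_replace('/(?<!:)\/\/(.*)\n/', '', $output);

// The first line loses its comment and its newline, the URL is kept,
// and the third line is truncated after "a because // is not preceded by a colon.
echo $cleaned;
```

So the lookbehind is enough for URLs, but it does not know about strings in general.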

Upvotes: 5

Gumbo

Reputation: 655609

You need a regular expression that can distinguish between code and comments. In particular, since the sequence // can appear either inside a string or at the start of a comment, you need to distinguish between strings and comments.

Here’s an example that might do this:

/(?:([^\/"']+|\/\*(?:[^*]|\*+[^*\/])*\*+\/|"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*')|\/\/.*)/

Using this in a replace function and replacing each match with the content of the first subpattern should then remove the // style comments.

Some explanation:

  • [^/"']+ matches any run of characters that cannot begin a comment (either //… or /*…*/) or a string
  • /\*(?:[^*]|\*+[^*/])*\*+/ matches the /* … */ style comments
  • "(?:[^"\\]|\\.)*" matches a string in double quotes
  • '(?:[^'\\]|\\.)*' matches a string in single quotes
  • \/\/.* finally matches the //… style comments.

As the first four constructs are grouped in a capturing group, the matched string is available, and nothing changes when the match is replaced with the content of the first subpattern. Only when a //… style comment is matched is the first subpattern empty, so the comment is replaced by an empty string.
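The replacement described above could be sketched in PHP roughly as follows. The sample JavaScript in $source is a hypothetical input, and the nowdoc syntax is only used here to avoid escaping the pattern a second time at the PHP level.

```php
<?php
// The pattern from this answer, kept verbatim inside a nowdoc (no PHP escaping applies).
$pattern = <<<'REGEX'
/(?:([^\/"']+|\/\*(?:[^*]|\*+[^*\/])*\*+\/|"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*')|\/\/.*)/
REGEX;

// Hypothetical JavaScript input with // inside a URL, a block comment and a string.
$source = <<<'JS'
var url = "http://this_is_not_a_comment.com/kickme"; // a real comment
/* block comment with // inside */
var s = 'it\'s // not a comment';
// whole-line comment
JS;

// Replace every match with group 1: code, strings and block comments are put back
// unchanged, while a // comment leaves group 1 empty and is therefore removed.
$cleaned = preg_replace($pattern, '$1', $source);

echo $cleaned;
```

Unlike a pattern ending in \n, this one leaves the line breaks in place, so an emptied line remains as a blank line and any whitespace before a trailing comment is kept.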

But note that this may fail. I’m not quite sure if it works for any input.

Upvotes: 8
