kmulvey

Reputation: 51

regex to strip all comments except within parentheses

I have a regex that finds single-line and multiline comments just fine, but I want to exclude comments that are within parentheses. I think I need a negative lookahead, but I have been unable to get it to work.

regex:

(?:/\\*(?:[^*]|(?:\\*+[^*/]))*\\*+/)|(?://.*)

samples of what to ignore:

url(data:image/gif;base64,R0lGODlhBgAGAIAAAOrq6v///yH5BAAHAP8ALAAAAAYzqlgoFADs=)
src: url(//:) format('no404')

Any help would be appreciated.

Upvotes: 0

Views: 87

Answers (2)

Tim Pietzcker

Reputation: 336128

Making sure // is inside a pair of parentheses is not directly possible with a Java regex. But you can take the liberty of checking whether the next parenthesis is a closing one (and reject the match if it is). Of course, that only works if there are no nested parentheses and no parentheses in a comment:

//(?![^()\\r\\n]*\\)).*

would do this.

As for the first part of your regex that matches /*...*/ comments - that's a bit overcomplicated, I think. Since Java doesn't allow nested comments,

/\\*.*?\\*/

would do. You just need to make sure that the dot is allowed to match newlines in this part of the regex:

Pattern regex = Pattern.compile("(?s)/\\*.*?\\*/|(?-s)//(?![^()\r\n]*\\)).*");
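As a rough check, the pattern above can be run against the samples from the question. This is only a sketch: the class name StripComments, the helper strip, and the extra test strings are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Pattern;

class StripComments {
    // Pattern from the answer above: block comments are matched in DOTALL mode;
    // a line comment is matched only if the next parenthesis on the same line
    // is not a closing one.
    static final Pattern COMMENTS =
        Pattern.compile("(?s)/\\*.*?\\*/|(?-s)//(?![^()\r\n]*\\)).*");

    // Hypothetical helper: removes every match of the pattern.
    static String strip(String input) {
        return COMMENTS.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // Samples from the question: both should come back unchanged.
        System.out.println(strip("src: url(//:) format('no404')"));
        System.out.println(strip(
            "url(data:image/gif;base64,R0lGODlhBgAGAIAAAOrq6v///yH5BAAHAP8ALAAAAAYzqlgoFADs=)"));
        // Illustrative inputs: real comments outside parentheses are removed.
        System.out.println(strip("a { color: red; } // trailing comment"));
        System.out.println(strip("a /* block\n comment */ b"));
    }
}
```

As the answer notes, this relies on there being no nested parentheses and no parentheses inside a comment. Only the // branch carries the parenthesis check here; a /*...*/ comment inside parentheses would still be removed.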

Upvotes: 1

Bohemian

Reputation: 424993

You could get away with this:

your_regex(?![^(]*\\))

i.e., add a negative lookahead to the end to assert that the next bracket character is not a closing bracket.
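A sketch of how this might be wired up with the regex from the question. It assumes your_regex is wrapped in a non-capturing group, since otherwise the lookahead would bind only to the last alternative. The class name LookaheadAtEnd and the helper strip are hypothetical.

```java
import java.util.regex.Pattern;

class LookaheadAtEnd {
    // The question's regex, wrapped in (?:...) so that the appended lookahead
    // applies to both the block-comment and the line-comment alternative.
    static final Pattern P = Pattern.compile(
        "(?:(?:/\\*(?:[^*]|(?:\\*+[^*/]))*\\*+/)|(?://.*))(?![^(]*\\))");

    // Hypothetical helper: removes every match of the pattern.
    static String strip(String s) {
        return P.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        // Block comment outside parentheses: removed.
        System.out.println(strip("a /* c */ b"));
        // Block comment directly followed by ')': kept.
        System.out.println(strip("url(/* c */)"));
        // Line-comment case: the greedy //.* runs to the end of the line and
        // consumes the ')' itself, so the lookahead no longer sees it.
        System.out.println(strip("src: url(//:) format('no404')"));
    }
}
```

Because the lookahead is evaluated at the end of the match, it protects /*...*/ comments inside parentheses, but a greedy //.* swallows the closing parenthesis before the check runs. With this sketch the last call returns "src: url(", so for // comments the check seems to work better placed directly after the two slashes.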

Upvotes: 0
