Lithium2142

Reputation: 23

Regex Match Whole Multiline Comment Containing Special Word

I've been trying to design this regex, but for the life of me I cannot stop it from matching when */ is reached before the special word.

I'm trying to match a whole multiline comment only if it contains a special word. I tried negative lookaheads/lookbehinds but could not figure out how to use them properly.

This is what I have so far: (?s)(/\*.+?special.+?\*/)

Am I close or horribly off base? I tried including (?!\*/) unsuccessfully.

https://regex101.com/r/mD1nJ2/3

Edit: I removed some redundant parts of the regex.

Upvotes: 2

Views: 46

Answers (1)

Jan

Reputation: 43169

You were not totally off base:

/\*                 # match /*
(?:(?!\*/)[\s\S])+? # match anything lazily, do not overrun */
special             # match special
[\s\S]+?            # match anything lazily afterwards
\*/                 # match the closing */

The technique is called a tempered greedy token; see a demo on regex101.com (mind the modifiers, e.g. x for verbose mode!).
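
For anyone who wants to try this outside regex101, here is a minimal Python sketch of the same pattern using re.VERBOSE. The sample text and the names TEMPERED, NAIVE and sample are made up for illustration; the naive pattern is the one from the question, included only to show the difference.

```python
import re

# The pattern from the answer, compiled in verbose mode so the
# whitespace and the trailing comments inside it are ignored.
TEMPERED = re.compile(r"""
    /\*                   # match /*
    (?:(?!\*/)[\s\S])+?   # match anything lazily, do not overrun */
    special               # match special
    [\s\S]+?              # match anything lazily afterwards
    \*/                   # match the closing */
""", re.VERBOSE)

# The pattern from the question, for comparison.
NAIVE = re.compile(r"(?s)(/\*.+?special.+?\*/)")

# Hypothetical input: the first comment lacks the word, the second has it.
sample = """/* plain comment */
int x = 1; // special outside a block comment
/* this one
   has the special word
   in it */
"""

tempered_matches = TEMPERED.findall(sample)
naive_matches = NAIVE.findall(sample)

# The tempered pattern returns only the second comment.
print(tempered_matches)
# The naive pattern starts at the first /* and runs across its closing */.
print(naive_matches)
```

The naive match begins at the first comment because .+? is free to step over */ while it searches for the word; the lookahead (?!\*/) is what prevents that.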


You might want to try another approach though: parse your document, extract the comments (using e.g. BeautifulSoup) and run string functions over them (if "special" in comment...).
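
A rough sketch of that two-step idea in Python, under some assumptions: the comments are pulled out with a plain regex rather than BeautifulSoup (swapped in here only because /* ... */ comments are easy to grab that way), and the helper name comments_containing is hypothetical.

```python
import re

# Step 1: grab every /* ... */ block, lazily, across newlines.
COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)


def comments_containing(source, word):
    """Return the block comments in `source` that contain `word`."""
    # Step 2: a plain substring test on each extracted comment.
    return [c for c in COMMENT.findall(source) if word in c]


# Hypothetical input for illustration.
source = "/* first */ code(); /* has the special word */ more();"
found = comments_containing(source, "special")
print(found)
```

Splitting extraction from filtering keeps each step simple, at the cost of scanning the comments twice.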

Upvotes: 2
