ideasman42

Reputation: 48058

How to match text between words using regex?

I would like to match text between pre-processor defines in Python.

In this example I'd like to match and remove the "#if 0" block, so that lines 2..4 are removed, e.g.:

#if 1
#  if 0
Remove me
#  endif
Keep me
#endif

This regex removes text, but the .* doesn't stop at the first #endif:

import re

def remove_if0(string):
    pattern = r"(^\s*#\s*if\s+0\b.*^\s*#\s*endif\b)"
    regex = re.compile(pattern, re.MULTILINE | re.DOTALL)
    return regex.sub("", string)

Is there a way to match the pair without the DOTALL .* reading past the closing term, e.g. ^\s*#\s*endif\b?

I tried a negative lookahead, (?!word), e.g. (?!^\s*#\s*endif\b)*, but it didn't work.

Upvotes: 0

Views: 46

Answers (1)

ideasman42

Reputation: 48058

The solution is to use the non-greedy .*? (thanks to @bobble-bubble), which stops at the first matching #endif instead of the last one.
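
To illustrate the difference, here is a small sketch comparing the greedy and non-greedy patterns. The `sample` string is made up for this demonstration and is not from the question:

```python
import re

# Hypothetical input with two separate "#if 0" blocks and a line to keep between them.
sample = "#if 0\nA\n#endif\nKeep\n#if 0\nB\n#endif\n"
flags = re.MULTILINE | re.DOTALL

greedy = re.compile(r"^\s*#\s*if\s+0\b.*^\s*#\s*endif\b", flags)
lazy = re.compile(r"^\s*#\s*if\s+0\b.*?^\s*#\s*endif\b", flags)

# Greedy: .* runs to the LAST "#endif", so "Keep" is swallowed as well.
print(repr(greedy.sub("", sample)))  # '\n'

# Non-greedy: .*? stops at the FIRST "#endif" after each "#if 0".
print(repr(lazy.sub("", sample)))    # '\nKeep\n\n'
```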

Here is a working Python function:

import re

def remove_if0(string):
    # Non-greedy .*? stops at the first "#endif" after each "#if 0".
    pattern = r"(^\s*#\s*if\s+0\b.*?^\s*#\s*endif\b)"
    regex = re.compile(pattern, re.MULTILINE | re.DOTALL)
    return regex.sub("", string)
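
Note that on the question's example the function above leaves an empty line where the block was, because the match ends right after "endif" and the trailing newline is kept. A possible variant that removes whole lines is sketched below. The name `remove_if0_lines` is made up here, and the pattern uses `[ \t]` rather than `\s` on the assumption that directives sit on a single line:

```python
import re

# Hypothetical variant: also consume the rest of the "#endif" line and its newline,
# so the removed block does not leave an empty line behind.
_IF0_BLOCK = re.compile(
    r"^[ \t]*#[ \t]*if[ \t]+0\b.*?^[ \t]*#[ \t]*endif\b[^\n]*\n?",
    re.MULTILINE | re.DOTALL,
)

def remove_if0_lines(source):
    """Remove each '#if 0' ... '#endif' block, including its final line break."""
    return _IF0_BLOCK.sub("", source)

# The example from the question.
sample = (
    "#if 1\n"
    "#  if 0\n"
    "Remove me\n"
    "#  endif\n"
    "Keep me\n"
    "#endif\n"
)

print(remove_if0_lines(sample))
# #if 1
# Keep me
# #endif
```

Both versions pair "#if 0" with the first "#endif" that follows, so a conditional nested inside the "#if 0" block would end the match too early. Handling nesting needs a small parser that tracks depth rather than a single regex.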

Upvotes: 2
