paranormaldist

Reputation: 508

Python regex both multiline and single line within same statement

I am unsure how to account for an instance in which one part of the statement uses re.M and the other uses re.S. I can't seem to find a similar question or resolution.

I have this statement:

[re.findall('(?<!^--.*)[^\s]*_[^\s]*', i) 
    for i in re.findall('\),\s+--+(.*?)as\s\(', text,flags=re.S)]

I would like to use re.M for the first (inner) pattern and keep re.S for the second (outer) one. That is, search in multiline mode for (?<!^--.*)[^\s]*_[^\s]* and in single-line (dot-all) mode for \),\s+--+(.*?)as\s\(

Sample text:

text text
),  
-- the dog jumped_over_the_moon
cat_dog as (
other text 
text 
text_text_other
text
text

This doesn't work; it seems that only the re.S flag on the second call takes effect:

[re.findall('(?<!^--.*)[^\s]*_[^\s]*', i, flags=re.M) 
    for i in re.findall('\),\s+--+(.*?)as\s\(', text,flags=re.S)]

The desired outcome is only

cat_dog

whereas the result currently produced is

jumped_over_the_moon
cat_dog

I would like it to skip the line that starts with --.

Upvotes: 0

Views: 61

Answers (1)

Dr. Regex

Reputation: 166

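As Pranav commented, Python's re module, unlike some other regular-expression engines and packages, requires the pattern inside a lookbehind to be fixed-width, so (?<!^--.*) is not accepted. Instead of a lookbehind, anchor the match at the start of the line and require that the first character is not a hyphen. So you should modify your first regex to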

^[^-].*?[^\s]*_[^\s]*
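
The fixed-width restriction can be checked directly. The snippet below is only a small sketch, assuming the standard-library re module (the third-party regex package handles lookbehind differently); compiles is a hypothetical helper introduced here for illustration.

```python
import re

# The question's inner pattern: the lookbehind contains .* and so has no fixed width.
variable_width = r'(?<!^--.*)[^\s]*_[^\s]*'
# The replacement suggested above: no lookbehind at all.
anchored = r'^[^-].*?[^\s]*_[^\s]*'


def compiles(pattern):
    """Hypothetical helper: True if the stdlib re module accepts the pattern."""
    try:
        re.compile(pattern, re.M)
        return True
    except re.error:
        return False


print(compiles(variable_width))  # False
print(compiles(anchored))        # True
```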

Edit: According to the updated question, this should be

^[^-].*?[^\s]*_[^\s]*(?=\sas\s\()

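So the final loop should look like this. Note that the outer pattern no longer has a capture group, so each i is the whole match and still contains the -- line for the inner pattern to skip: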

[re.findall(r'^[^-].*?[^\s]*_[^\s]*(?=\sas\s\()', i, re.M) for i in re.findall(r'\),\s+--+.*?as\s\(', text, flags=re.S)]

which would return [["cat_dog"]].
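
For reference, here is a self-contained sketch that runs the loop against the sample text from the question. The variable names text, blocks and result are just illustrative, and the sample is rebuilt line by line so the trailing spaces after ")," survive. Each re.findall call takes its own flags, so re.S on the outer call and re.M on the inner call do not interfere with each other.

```python
import re

# Sample text from the question, joined with explicit newlines.
text = "\n".join([
    "text text",
    "),  ",
    "-- the dog jumped_over_the_moon",
    "cat_dog as (",
    "other text ",
    "text ",
    "text_text_other",
    "text",
    "text",
])

# Outer search: re.S lets .*? run across newlines. There is no capture
# group, so findall returns each whole match, including the "--" line.
blocks = re.findall(r'\),\s+--+.*?as\s\(', text, flags=re.S)

# Inner search: re.M makes ^ match at the start of every line, and
# [^-] rejects lines whose first character is a hyphen.
result = [
    re.findall(r'^[^-].*?[^\s]*_[^\s]*(?=\sas\s\()', block, flags=re.M)
    for block in blocks
]

print(result)  # [['cat_dog']]
```

One caveat with this approach: [^-] also matches a newline, so an empty line directly before a -- line could let a match begin on the empty line and continue into the comment. If that can occur in your input, [^-\n] is a stricter choice.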

Upvotes: 1
