Reputation: 23
I'm looking for a regex that will get me everything in a piece of text up to the first blank line. I have the following:
reg = r'((Opposition|Oppose):?\s*)(.*?)\n\n'
str1 = """Opposition
California Attorneys for Criminal Justice
Californians for Safety and Justice
Drug Policy Alliance
Friends Committee on Legislation of California
Legal Services for Prisoners with Children
Analysis Prepared
"""
str2 = """Oppose: None received
-- END --
"""
When I run:
match = re.search(reg, str1, re.DOTALL)
print match.group(3)
I get:
California Attorneys for Criminal Justice
Californians for Safety and Justice
Drug Policy Alliance
Friends Committee on Legislation of California
Legal Services for Prisoners with Children
But when I run:
match = re.search(reg, str2, re.DOTALL)
print match.group(3)
I get:
None received
-- END --
The outcome for the first string is correct, but what I want from the second string is just the "None received". I can't come up with a good explanation for why I get the "-- END --" as well. Shouldn't my regex match the \n after "None received" as well as the \n on the blank line and stop? Any help would be appreciated.
Upvotes: 2
Views: 1719
Reputation: 627101
You can make sure you match whitespace-only lines with [^\S\n]*
(= match 0 or more whitespace characters other than a newline, e.g. spaces or tabs):
((Oppos(?:e|ition)):?\s*)(.*?)\n[^\S\n]*\n[^\S\n]*
See demo
I also shortened the 2nd capture group a bit.
Here is an IDEONE demo
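For reference, here is a minimal Python 3 sketch of how the pattern above behaves. The sample strings are shortened stand-ins, not your exact data, and I am assuming that the "blank" line after "None received" actually contains a couple of spaces, which would explain why a plain \n\n does not stop there. The helper name first_block is made up for illustration.

```python
import re

# Original pattern from the question: requires two adjacent newlines.
strict = re.compile(r'((Opposition|Oppose):?\s*)(.*?)\n\n', re.DOTALL)

# Pattern from this answer: the separator line may hold spaces or tabs.
lenient = re.compile(
    r'((Oppos(?:e|ition)):?\s*)(.*?)\n[^\S\n]*\n[^\S\n]*',
    re.DOTALL,
)

# Hypothetical inputs. In text2 the separator line contains two spaces
# (an assumption about the real data), so it is not a truly empty line.
text1 = "Opposition\nDrug Policy Alliance\nFriends Committee\n\nAnalysis Prepared\n"
text2 = "Oppose: None received\n  \n-- END --\n\n"


def first_block(pattern, text):
    """Return capture group 3 of the first match, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(3) if m else None


print(repr(first_block(strict, text2)))   # runs past the whitespace-only line
print(repr(first_block(lenient, text1)))
print(repr(first_block(lenient, text2)))
```

With these inputs the strict pattern keeps going until the real \n\n after "-- END --", while the lenient one stops at the whitespace-only line. If your data turns out to have no stray whitespace, the cause is something else and this sketch will not reproduce it.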
Upvotes: 1