Reputation: 923
I am trying to find a regex that gets the text between Explanation One: and Explanation Two:
The trick is that the text may or may not exist, and it can be on the same line as Explanation One: or on the line after it. The current regex in the code below returns an additional line along with the text that precedes Explanation Two:
Any pointers on getting just the text, ignoring the additional empty lines, would be appreciated.
import re
STRING="""Explanation One:
Blah Blah
Explanation Two: ndnlnlkn
"""
pattern = r'Explanation One:[\r\n ].*(?=Explanation Two:)+'
regex = re.compile(pattern, re.IGNORECASE)
print(regex.search(STRING).group())
Output:
Explanation One:
Blah Blah
Upvotes: 1
Views: 152
Reputation: 163297
To match the text between Explanation One: and Explanation Two: you could capture it in a group, using the DOTALL flag or the inline modifier (?s) to make the dot match a newline.
Explanation One:\s*(.*?)\s*Explanation Two
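As a rough sketch of how this pattern could be used from Python, the snippet below compiles it once with re.DOTALL and once with inline flags, then runs it on the three cases from the question. The helper name between_markers and the sample strings are made up for illustration, and re.IGNORECASE is carried over from the question's code.

```python
import re

# Pattern from this answer; re.DOTALL lets "." match newlines.
PATTERN = re.compile(
    r'Explanation One:\s*(.*?)\s*Explanation Two',
    re.DOTALL | re.IGNORECASE,
)

# Same pattern with inline flags instead of compile-time flags.
INLINE = re.compile(r'(?si)Explanation One:\s*(.*?)\s*Explanation Two')


def between_markers(text):
    """Hypothetical helper: text between the markers, or None if no match."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


# Made-up inputs covering the three cases described in the question.
next_line = "Explanation One:\nBlah Blah\nExplanation Two: ndnlnlkn\n"
same_line = "Explanation One: Blah Blah\nExplanation Two: ndnlnlkn\n"
no_text = "Explanation One:\nExplanation Two: ndnlnlkn\n"

print(repr(between_markers(next_line)))  # 'Blah Blah'
print(repr(between_markers(same_line)))  # 'Blah Blah'
print(repr(between_markers(no_text)))    # ''
```

Because the \s* on each side of the group absorbs the surrounding whitespace, the captured text comes back without leading or trailing newlines.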
Explanation
Explanation One: matches the text literally
\s* matches a whitespace character zero or more times
(.*?) captures in a group any character, zero or more times, non-greedy
\s* matches a whitespace character zero or more times
Explanation Two matches the text literally
Upvotes: 2
Reputation: 521194
The problem with your current approach is that the mode in which you are running your regex is not DOTALL mode. This means that .* will not match across lines, which is precisely what you want it to do, until it reaches the Explanation Two: marker text. One way around this is to match the following:
[\s\S]*
This will match anything, whitespace or non whitespace, meaning it will match everything even across lines.
import re
# STRING is the sample text defined in the question
pattern = r'Explanation One:([\s\S]*)(?=Explanation Two:)'
searchObj = re.search(pattern, STRING, re.M|re.I)
print(searchObj.group(1))
Blah Blah
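To make the difference concrete, here is a small sketch comparing the dot with [\s\S] when no DOTALL flag is set. The sample text mirrors the question's STRING. Note that the captured group keeps the newlines on either side of the text, so the sketch calls strip() on it; that step is an addition here, not part of the answer's code.

```python
import re

text = "Explanation One:\nBlah Blah\nExplanation Two: ndnlnlkn\n"

# Without re.DOTALL, "." stops at the newline, so this pattern finds nothing.
dot_match = re.search(r'Explanation One:(.*)(?=Explanation Two:)', text)

# [\s\S] matches any character, newlines included, with no flag needed.
class_match = re.search(r'Explanation One:([\s\S]*)(?=Explanation Two:)', text)

print(dot_match)                           # None
print(repr(class_match.group(1)))          # '\nBlah Blah\n'
print(repr(class_match.group(1).strip()))  # 'Blah Blah'
```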
By the way, an alternative would be to keep the dot in your pattern and add the re.DOTALL flag to the re.search call. So the following should also work:
pattern = r'Explanation One:(.*)(?=Explanation Two:)'
searchObj = re.search(pattern, STRING, re.M|re.I|re.DOTALL)
print(searchObj.group(1))
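One caveat with a greedy (.*) under DOTALL: if the closing marker can appear more than once, the match runs to the last occurrence. The sketch below uses a made-up two-section input to contrast the greedy form with a non-greedy (.*?); the re.M flag is left out because it only changes how ^ and $ behave, and this pattern uses neither.

```python
import re

# Made-up input in which both markers appear twice.
text = (
    "Explanation One:\nfirst\nExplanation Two: a\n"
    "Explanation One:\nsecond\nExplanation Two: b\n"
)

# Greedy: captures up to the LAST "Explanation Two:" in the text.
greedy = re.search(r'Explanation One:(.*)(?=Explanation Two:)', text, re.DOTALL)

# Non-greedy: stops at the FIRST "Explanation Two:" after the opening marker.
lazy = re.search(r'Explanation One:(.*?)(?=Explanation Two:)', text, re.DOTALL)

print(repr(greedy.group(1).strip()))
print(repr(lazy.group(1).strip()))  # 'first'
```

For input with a single pair of markers, as in the question, both forms capture the same text.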
Upvotes: 1