Regular expression to capture n lines of text between two regex patterns

Question

I need help with a regular expression that grabs exactly n lines of text between two regex matches. For example, I need 17 lines of text, but the code below does not work.

Please see sample code below:

import re
match_string = re.search(r'^.*MDC_IDC_RAW_MARKER((.*?\r?\n){17})Stored_EGM_Trigger.*\n', t, re.DOTALL).group()
value1 = re.search(r'value="(\d+)"', match_string).group(1)
value2 = re.search(r'value="(\d+\.\d+)"', match_string).group(1)
print(match_string)
print(value1)
print(value2)

I put a sample string here, because SO does not allow long code strings: https://hastebin.com/aqowusijuc.xml

Booboo · Accepted Answer

You are getting false positives because you are using the re.DOTALL flag, which allows the . character to match newline characters. That is, when you are matching ((.*?\r?\n){17}), a single .*? can run across several newlines, so one repetition of the group may swallow many extra lines just to satisfy your required count of 17. As you have since realized, the \r? is superfluous. Starting your regex with ^.* is also superfluous, because you are forcing the search to start from the beginning and then telling the regex engine to skip as many characters as necessary to find MDC_IDC_RAW_MARKER, which re.search already does on its own. So, a simplified and correct regex would be:

match_string = re.search(r'MDC_IDC_RAW_MARKER.*\n((.*\n){17})Stored_EGM_Trigger.*\n', t)

Regex Demo
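
To see the difference on a small input, here is a minimal sketch. The sample text and the helper names lines_between and dotall_matches are made up for illustration (they are not from the question's data), and the sample uses 3 lines instead of 17 to keep it short. It assumes, as the simplified regex does, that Stored_EGM_Trigger sits at the start of its line.

```python
import re

# Made-up sample: three lines sit between the marker line and the trigger line.
SAMPLE = (
    'MDC_IDC_RAW_MARKER start\n'
    'value="1"\n'
    'value="2.5"\n'
    'value="3"\n'
    'Stored_EGM_Trigger end\n'
)

def lines_between(text, n):
    """Return the n lines between the two markers, or None if the count differs."""
    # Without re.DOTALL, '.' stops at a newline, so each (?:.*\n) is exactly one line.
    pattern = r'MDC_IDC_RAW_MARKER.*\n((?:.*\n){%d})Stored_EGM_Trigger.*\n' % n
    m = re.search(pattern, text)
    return m.group(1).splitlines() if m else None

def dotall_matches(text, n):
    """Report whether the question's original pattern matches for a given n."""
    pattern = r'^.*MDC_IDC_RAW_MARKER((.*?\r?\n){%d})Stored_EGM_Trigger.*\n' % n
    return re.search(pattern, text, re.DOTALL) is not None

print(lines_between(SAMPLE, 3))   # ['value="1"', 'value="2.5"', 'value="3"']
print(lines_between(SAMPLE, 2))   # None: the line count must be exact
print(dotall_matches(SAMPLE, 2))  # True: a false positive caused by re.DOTALL
```

With re.DOTALL, the second repetition of the group stretches across value="1", value="2.5" and value="3" until Stored_EGM_Trigger can match, which is why the original pattern accepts the wrong count.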
