IAMeNTROPY

Reputation: 15

Match pattern on new lines after first pattern is found?

I have the following representative data:

Lots of text, lots of text
PATTERN2
PATTERN2
text PATTERN1 text
text
text
..
..
text
PATTERN2
PATTERN2
PATTERN2
PATTERN2
PATTERN2
..
..
PATTERN2

Basically I want to capture all of the instances of PATTERN2 but only after PATTERN1 shows up in the file.

PATTERN1 is a few characters, and PATTERN2 starts with a Timestamp (HH:MM:SS.sss) and I need to capture the entire line when PATTERN2 is found. Also worth noting that PATTERN2 shows up all over the txt file, but I only want to match PATTERN2 after PATTERN1 has been found.

I've tried various regular expressions (I'm a newb and am fumbling) to no avail. I'm testing with https://regexr.com/ and https://regex101.com, but ultimately it's going to be used in a Python script.

Any help would be greatly appreciated!

Upvotes: 0

Views: 289

Answers (1)

Tim Biegeleisen

Reputation: 520958

One approach splits the input on the first PATTERN1 with re.split and then runs re.findall on the text that follows:

import re

inp = """Lots of text, lots of text
PATTERN2
PATTERN2
text PATTERN1 text
text
text
..
..
text
PATTERN2
PATTERN2
PATTERN2
PATTERN2
PATTERN2
..
..
PATTERN2"""

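matches = []
# Only search if the PATTERN1 marker is present at all
if re.search(r'\bPATTERN1\b', inp):
    # Keep only the text after the first PATTERN1 occurrence
    text = re.split(r'\bPATTERN1\b', inp, maxsplit=1)[1]
    # Collect every PATTERN2 in that remaining text
    matches = re.findall(r'\bPATTERN2\b', text)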

print(matches)
# ['PATTERN2', 'PATTERN2', 'PATTERN2', 'PATTERN2', 'PATTERN2', 'PATTERN2']

Here we first check that the input text contains the PATTERN1 marker. If it does not, there are no matches. Otherwise, we do a regex split to get the text occurring after the first PATTERN1 occurrence. Finally, re.findall finds all the PATTERN2 occurrences in that remaining text.
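
Since the question asks for the entire line and says PATTERN2 starts with an HH:MM:SS.sss timestamp, here is a minimal sketch of how that might look. The real patterns were not posted, so everything specific here is an assumption: the marker "ERR42" stands in for PATTERN1, the sample text is made up, and the helper name lines_after_marker is hypothetical. It uses str.partition to take the text after the first marker, then a multiline regex to return each full line that begins with a timestamp.

```python
import re

# Hypothetical sample: "ERR42" stands in for PATTERN1, and PATTERN2 lines
# are assumed to start with an HH:MM:SS.sss timestamp.
sample = """\
12:00:01.123 before the marker
some text ERR42 more text
plain text line
12:00:02.456 first event after marker
  12:00:02.999 indented, so not at line start
12:00:03.789 second event after marker
"""

# With re.MULTILINE, ^ and $ anchor at every line, so each match is one
# whole line that begins with a timestamp.
TIMESTAMP_LINE = re.compile(r'^\d{2}:\d{2}:\d{2}\.\d{3}.*$', re.MULTILINE)


def lines_after_marker(text, marker):
    """Return full timestamp-led lines that occur after the first marker."""
    _head, sep, tail = text.partition(marker)
    if not sep:  # marker never appears, so nothing qualifies
        return []
    return TIMESTAMP_LINE.findall(tail)


print(lines_after_marker(sample, "ERR42"))
# ['12:00:02.456 first event after marker', '12:00:03.789 second event after marker']
```

If PATTERN1 is itself a regex rather than a literal string, the re.split call from the snippet above can replace str.partition. The rest of the sketch stays the same.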

Upvotes: 1
