Chirsty CR_007

Reputation: 69

How to capture 5 lines after a regex match using Python

I have text in which a line starts with a 3-digit code. I have written logic to capture the matching line, but I also need to capture the 5 lines that follow it.

import re
newtxt = "200 sample text with many lines\n hell01 \n hell02 \n hell03 \n hell04 \n hell05\n hell06\n hell07 \n hell08"
text = re.compile(r'^\d{3} [a-z].*')
for line in newtxt.split('\n'):
    if text.match(line):
        print(line)

Upvotes: 2

Views: 1145

Answers (2)

Wiktor Stribiżew

Reputation: 626728

You may use

r'(?m)^\d{3} [a-z].*((?:\r?\n.*){0,5})'

See the regex demo. Note that the inline (?m) flag can be replaced with the re.M flag in the code.

Details

  • ^ - start of a line
  • \d{3} [a-z] - three digits, space and a lowercase letter
  • .* - the rest of the line
  • ((?:\r?\n.*){0,5}) - Group 1: zero to five repetitions of a line break followed by the rest of that line.

Python demo:

import re
newtxt = "200 sample text with many lines\n hell01 \n hell02 \n hell03 \n hell04 \n hell05\n hell06\n hell07 \n hell08"
pattern = re.compile(r'^\d{3} [a-z].*((?:\r?\n.*){0,5})', re.M)
m = pattern.search(newtxt)
if m:
    # Group 1 starts with the line break that ends the matched line.
    print(m.group(1))

Output:

 hell01 
 hell02 
 hell03 
 hell04 
 hell05
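
To collect the following lines for every matching line rather than only the first one, the same pattern can be used with finditer. This is a minimal sketch, not part of the answer above: the sample string and the following_lines helper are made up for illustration. Since Group 1 starts with the line break that ends the matched line, the first element returned by splitlines() is an empty string and is dropped.

```python
import re

# Hypothetical sample with two "header" lines (not from the original post).
sample = (
    "200 sample text\n a1\n a2\n a3\n a4\n a5\n a6\n"
    "404 another block\n b1\n b2"
)

pattern = re.compile(r'^\d{3} [a-z].*((?:\r?\n.*){0,5})', re.M)

def following_lines(text):
    """For each matching header line, return the list of up to five lines after it."""
    # group(1) begins with a line break, so splitlines()[0] is '' and is skipped.
    return [m.group(1).splitlines()[1:] for m in pattern.finditer(text)]

blocks = following_lines(sample)
for block in blocks:
    print(block)
```

Because {0,5} is greedy, a second header that appears within five lines of the first is captured as part of the first block and is not matched on its own.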

Upvotes: 0

Rakesh

Reputation: 82755

You can use iter() to turn the list of lines into an iterator and pull the next five lines with next().

Example:

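import re
newtxt = "200 sample text with many lines\n hell01 \n hell02 \n hell03 \n hell04 \n hell05\n hell06\n hell07 \n hell08"
text = re.compile(r'^\d{3} [a-z].*')
# Wrap the lines in an iterator so next() can advance past the matched line.
newtext = iter(newtxt.splitlines())
for line in newtext:
    if text.match(line):
        # Print the five lines after the match; the outer loop resumes after them.
        for _ in range(5):
            print(next(newtext))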

Output:

 hell01 
 hell02 
 hell03 
 hell04 
 hell05

If you are reading from a file object, you do not need iter(): a file object is already an iterator over its lines, so you can call next() on it directly.

Example:

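import re

text = re.compile(r'^\d{3} [a-z].*')
# filename is a placeholder for the path to your input file.
with open(filename) as infile:
    for line in infile:
        if text.match(line):
            for _ in range(5):
                # Lines read from a file keep their trailing newline.
                print(next(infile), end='')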
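
One caveat with the next() calls above: if fewer than five lines remain after a match, next() raises StopIteration. A possible variant, sketched here with itertools.islice, simply stops when the input runs out. The lines_after_match generator and its count parameter are names chosen for this illustration, not part of the answer; it accepts any iterable of lines, including an open file.

```python
import re
from itertools import islice

header = re.compile(r'^\d{3} [a-z].*')

def lines_after_match(lines, count=5):
    """Yield up to `count` lines following each line that matches `header`."""
    it = iter(lines)
    for line in it:
        if header.match(line):
            # islice takes at most `count` items and stops quietly
            # if the iterator is exhausted first.
            yield from islice(it, count)

newtxt = "200 sample text with many lines\n hell01 \n hell02 \n hell03 \n hell04 \n hell05\n hell06\n hell07 \n hell08"
captured = list(lines_after_match(newtxt.splitlines()))
print(captured)

# Only two lines follow the match here, so only two are returned.
short = list(lines_after_match(["200 only two follow", " x", " y"]))
print(short)
```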

Upvotes: 2
