Ooker

Reputation: 3086

How to search backward several lines in Python 3?

There is a solution for searching backward within a single line in "Python Reverse Find in String":

s.rfind('I', 0, index)
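
As a quick illustration of how that call behaves, here is a minimal sketch. The sample string and the variable names are made up for this example. str.rfind(sub, start, end) returns the highest index at which sub starts inside s[start:end], or -1 if it does not occur there.

```python
# Hypothetical sample string, used only to illustrate str.rfind.
s = "I think I can"
index = s.find("can")  # position of the first keyword

# Search backward for 'I' in s[0:index]; the highest matching index wins.
last_i = s.rfind("I", 0, index)

# rfind returns -1 when the substring is absent from the given slice.
missing = s.rfind("z", 0, index)

print(index, last_i, missing)
```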

But what if I need to search for a string in the several lines above that line? Say I have found the keyword using:

with open("file.txt") as f:
    searchlines = f.readlines()

for i, line in enumerate(searchlines):
    if "keyword" in line:
        do_something()

I want do_something() to search backward for another keyword. To apply the code above, I think I need f.read() so that I have the file as a single string. But this is totally nuts, since I would have to call both readlines() and read() on the (large) file. I need readlines() because the first keyword may appear several times in the text, and I need to find every occurrence.

Is there any better way to do this?

@engineer
- kỹ sư
@engineering
- kỹ thuật
- civil e. ngành xây dựng
- communication e. kỹ thuật thông tin
- control e. kỹ thuật [điều chỉnh, điều khiển] (tự động)
- development e. nghiên cứu những kết cấu mới

Upvotes: 1

Views: 732

Answers (1)

I'd approach it this way: since you want to find the line starting with @, I'd store the lines in a list and discard the previously stored lines whenever a new line starting with @ is found.

Thus we get:

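def do_something(lines):
    # Print the collected block, from the '@' header down to the matching line.
    print("I've got:")
    print(''.join(lines))

lines = []

with open("file.txt") as f:
    for line in f:
        if line.startswith('@'):
            # A new entry begins: discard the lines of the previous one.
            lines = []

        lines.append(line)
        if 'development' in line:
            do_something(lines)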

With the file.txt you have, the output will be:

I've got:
@engineering
- kỹ thuật
- civil e. ngành xây dựng
- communication e. kỹ thuật thông tin
- control e. kỹ thuật [điều chỉnh, điều khiển] (tự động)
- development e. nghiên cứu những kết cấu mới
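
If only the @ header is needed rather than the whole block, the same forward pass can remember just the most recent header line, so no backward search is required. The following is a minimal sketch under that assumption. The helper name find_headers is hypothetical, and the data is a shortened copy of the question's file held in an io.StringIO so the example runs without file.txt.

```python
import io

# Shortened sample data from the question, kept in memory for this sketch;
# in practice this would be the object returned by open("file.txt").
sample = io.StringIO(
    "@engineer\n"
    "- kỹ sư\n"
    "@engineering\n"
    "- kỹ thuật\n"
    "- development e. nghiên cứu những kết cấu mới\n"
)

def find_headers(f, keyword):
    """Return (header, line) pairs for every line containing keyword."""
    matches = []
    header = None  # most recent line starting with '@', if any
    for line in f:
        if line.startswith("@"):
            header = line.rstrip("\n")
        if keyword in line:
            matches.append((header, line.rstrip("\n")))
    return matches

result = find_headers(sample, "development")
print(result)
```

Because only one header string is kept at a time, memory use stays constant no matter how long each entry is.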

In the general case, if you just want the last N lines seen, you can use a collections.deque instead of a list:

from collections import deque

N = 100
# A deque with maxlen=N drops the oldest line once it already holds N lines.
last_lines = deque(maxlen=N)

with open("file.txt") as f:
    for line in f:
        last_lines.append(line)
        if 'development' in line:
            do_something(last_lines)

Now do_something will be passed up to the last 100 lines, including the current line, whenever the current line contains the word development.
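
To actually look backward for a second keyword inside those buffered lines, do_something could walk them in reverse. This is only a sketch: find_backward is a hypothetical helper, and the sample lines are made up to stand in for the deque built in the loop above.

```python
from collections import deque

def find_backward(lines, keyword):
    """Return the nearest earlier line containing keyword, or None.

    The last element of lines is treated as the current line and skipped.
    """
    previous = list(lines)[:-1]  # a deque cannot be sliced, so copy to a list
    for line in reversed(previous):
        if keyword in line:
            return line
    return None

# Made-up sample lines standing in for the buffered file content.
last_lines = deque(
    ["@engineer\n", "- kỹ sư\n", "@engineering\n", "- development e.\n"],
    maxlen=100,
)

found = find_backward(last_lines, "@")
print(repr(found))
```

The same function works on the list returned by readlines() if you pass it searchlines[:i + 1], at the cost of copying that slice.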

Upvotes: 4
