Python iteratively grep lines after a pattern in a file

Question

I have a file that looks like:

~~~~~~~~~~~~~~~~~~~~~
Start
2, 0.001, 1.0
alpha = 0.001
beta = 1.3

...
...
...

new evaluation
complete
print out alpha & beta
alpha = 0.19
beta = 1.41
End

~~~~~~~~~~~~~~~~~~~~~
Start
....

I would like to extract the three lines after "Start" and two lines after "print out". Basically, it should be:

~~~~~~~~~~~~~~~~~~~~~
2, 0.001, 1.0
alpha = 0.001
beta = 1.3

alpha = 0.19
beta = 1.41

~~~~~~~~~~~~~~~~~~~~~

This is what I used:

summary = open("summary_accuracy.txt","w")
content = []
with open(filename,'r') as f:
    for line in f:
        if "Start" in line:
            content += [f.readline() for i in range(3)]
        if "print out" in line:
            content += [f.readline() for i in range(2)]
            content += "~~~~~~~~~~"

summary.write(content)

However, I got the error:

content += [f.readline() for i in range(3)]
ValueError: Mixing iteration and read methods would lose data

Raymond Hettinger · Accepted Answer

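Use next(f) instead of f.readline(). The for loop reads the file through an internal read-ahead buffer, so calling f.readline() inside the loop would skip data, and Python 2 raises this ValueError to prevent that. next(f) pulls the following line from the same iterator the loop is already using.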
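As a minimal sketch of that change, the loop from the question could look like the following. The function name extract_lines is only illustrative and not from the question. Two other details are adjusted here on the assumption that they were unintended: content += "~~~~~~~~~~" would add each character as a separate list item, and write() expects a string rather than a list, so the sketch uses append() and writelines().

```python
def extract_lines(f):
    """Return the 3 lines after each "Start" and the 2 lines after each "print out"."""
    content = []
    for line in f:
        if "Start" in line:
            # next(f) advances the same iterator the for loop is consuming
            content += [next(f) for _ in range(3)]
        elif "print out" in line:
            content += [next(f) for _ in range(2)]
            # append keeps the separator as a single list item
            content.append("~~~~~~~~~~\n")
    return content

# Hypothetical usage with the names from the question:
# with open(filename) as f, open("summary_accuracy.txt", "w") as summary:
#     summary.writelines(extract_lines(f))
```

Note that next(f) raises StopIteration if the file ends before the expected number of lines has been read, so a truncated block would need extra handling.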

Also consider using regular expressions for this task, with the whole file read into a string s:

>>> import re
>>> re.search(r'^Start\s(.*\s.*\s.*\s)', s, re.MULTILINE).group(1)
'2, 0.001, 1.0\nalpha = 0.001\nbeta = 1.3\n'
>>> re.search(r'^print out.*\s(.*\s.*\s)', s, re.MULTILINE).group(1)
'alpha = 0.19\nbeta = 1.41\n'
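re.search only returns the first match, while the file in the question repeats the Start ... End block. One possible way to collect every block is to combine the two patterns and loop with re.finditer. This is a sketch, not part of the original answer; the names PATTERN and extract_all are made up for illustration, and it assumes the text ends with a newline.

```python
import re

# Two alternatives; each one captures the lines that follow its marker line.
PATTERN = re.compile(
    r'^Start\s(.*\s.*\s.*\s)|^print out.*\s(.*\s.*\s)',
    re.MULTILINE,
)

def extract_all(text):
    """Return the captured lines for every "Start" and "print out", in file order."""
    # Only one of the two groups participates in any given match.
    return [m.group(1) or m.group(2) for m in PATTERN.finditer(text)]
```

Each item in the returned list is one multi-line string, so the results can be joined with a separator and written out in a single call.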
