Rick K.

Reputation: 85

How do I look for a particular text in a file and then return its respective file name when found in Python?

Let's say I have a text file with the following content:

f: 1.pdf
t: abc
f: 2.pdf
t: as, as
asd
f: 3.pdf
t: found
f: 4.pdf
t: .,ad
.ads
f: 5.pdf
t: ad
f: 6.pdf
t: ...

I want my Python script to read this text file and, if it finds the word "found", write the file name from the line above it to an output file. In the example above, the script would write 3.pdf to the output file because the line below it contains the word "found".

I think this would require a loop and a regex to match the word? I have a rough idea but don't know how to begin.

Upvotes: 0

Views: 45

Answers (2)

alani

Reputation: 13079

This suggested approach is based on the clarifications that the line with t: will immediately follow the line with f: , and that it is preferable to have a solution that loops through the file rather than reading it all into memory.

Regular expression parsing is not required in this situation. The only complicating factor is that pairs of lines must be considered, rather than a line at a time. This is easily addressed by storing the value of the previous line in another variable, which is copied from the current line at the end of the loop, ready for the next iteration.

previous_line = None

with open("myinput") as fin:
    with open("myoutput", "w") as fout:
        for line in fin:
            line = line.strip()
            if (line == "t: found"
                and previous_line is not None
                and previous_line.startswith("f: ")):

                fout.write(previous_line[3:] + "\n")

            previous_line = line

Because each line is passed through strip() before the comparison, any leading or trailing whitespace around "t: found" (including the newline) has already been removed.
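
If the text after t: can run over several lines, as some entries in the question's sample do, the same idea can be extended by remembering the most recent f: line instead of only the previous line. The following is a minimal sketch under that assumption, not part of the original answer; the helper name find_files and the choice to match "found" as a whole word anywhere in the text are illustrative.

```python
import re


def find_files(lines, word="found"):
    """Return the file names whose text block contains `word` as a whole word.

    Hypothetical helper: assumes every entry starts with an "f: " line and
    that all following lines up to the next "f: " line belong to its text.
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    current_file = None  # most recent "f: " value seen so far
    matches = []
    for line in lines:
        line = line.strip()
        if line.startswith("f: "):
            current_file = line[3:]
        elif current_file is not None:
            # Drop the "t: " prefix if present; continuation lines have none.
            text = line[3:] if line.startswith("t: ") else line
            if pattern.search(text) and current_file not in matches:
                matches.append(current_file)
    return matches
```

Since an open file yields its lines when iterated, the file object from open("myinput") can be passed directly as lines, and the returned names written out one per line.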

Upvotes: 1

Red

Reputation: 27567

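You can open both files in a single with statement and collect the names with a list comprehension:

with open('text.txt', 'r') as s, open('output.txt', 'w') as f:
    lns = s.read().splitlines()
    # Take the file name from the line before each line ending in ': found'.
    # The i > 0 check stops a match on the first line from wrapping to lns[-1].
    t = [lns[i-1].split(': ', 1)[1] for i, ln in enumerate(lns) if i > 0 and ln.endswith(': found')]
    f.write('\n'.join(t))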

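If you want it spelled out more clearly:

# Read the whole input into a list of lines without trailing newlines.
with open('text.txt', 'r') as s:
    lines = s.read().splitlines()

files = []
for i, line in enumerate(lines):
    # Skip i == 0: there is no previous line, and lines[-1] would be the last line.
    if i > 0 and line.endswith(': found'):
        # The file name is whatever follows 'f: ' on the previous line.
        files.append(lines[i-1].split(': ', 1)[1])

with open('output.txt', 'w') as f:
    f.write('\n'.join(files))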
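
The index lookup lines[i-1] can also be avoided by pairing each line with the one after it using zip. This is only a sketch of that variation; the function name files_before_found is hypothetical, and it keeps the answer's assumption that the f: line sits directly above the matching line.

```python
def files_before_found(lines):
    """Return the file name from each "f: " line directly followed by a
    line ending in ": found". Hypothetical helper for illustration."""
    result = []
    # zip(lines, lines[1:]) yields (previous, current) pairs, so the first
    # line is never treated as a "current" line and no index can wrap around.
    for prev, cur in zip(lines, lines[1:]):
        if cur.rstrip().endswith(": found") and prev.startswith("f: "):
            result.append(prev.split(": ", 1)[1].strip())
    return result
```

The result can be written out with '\n'.join(...) exactly as in the code above.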

Upvotes: 1
