iLikeCookies

Reputation: 33

Iterate through lines of file until user input found

I am writing a script that takes input from a user and then searches for it in a file. If the user input occurs multiple times in the file, the code should replace only the first occurrence.

This is my code:

from tempfile import mkstemp
from shutil import move, copymode
from os import fdopen, remove

def replace(file_path, pattern, subst):
    fh, abs_path = mkstemp()
    with fdopen(fh,'w') as new_file:
        with open(file_path) as old_file:
            lines = old_file.readlines()
            for line in lines:
                new_file.write(line.replace(pattern, subst))
            for lines in old_file:
                new_file.write(lines.replace(pattern, pattern))
    copymode(file_path, abs_path)
    remove(file_path)
    move(abs_path, file_path)

test_input = 'fruits'
replace_with = 'bread'
# Here, file_path is just for example purposes
replace(file_path, test_input, replace_with)

File content before running the script:

I like cookies
I like fruits
I like fruits
I like fruits

What I want the file content to look like after running my script:

I like cookies
I like bread
I like fruits
I like fruits

What it actually looks like after running the script:

I like cookies
I like bread
I like bread
I like bread

How can I fix the code to get the desired result?

Upvotes: 0

Views: 551

Answers (2)

Hai Vu

Reputation: 40723

If the file is small to medium in size, we can read it all at once and replace:

def replace(file_path, pattern, subst):
    with open(file_path) as stream:
        data = stream.read()

    data = data.replace(pattern, subst, 1)
    with open(file_path, "w") as stream:
        stream.write(data)

This solution needs neither a temp file nor any copying. I believe it is simple enough to be easy to understand. Note that the data.replace() call takes a number as its third parameter, which limits how many occurrences are replaced.
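
To make the effect of that third argument concrete, here is a minimal sketch that compares a call with the count against a call without it. The sample text and the variable names are made up for illustration.

```python
# Sample text standing in for the file contents (made up for illustration).
text = "I like fruits\nI like fruits\n"

# With a count of 1, only the first occurrence is replaced.
first_only = text.replace("fruits", "bread", 1)

# Without a count, every occurrence is replaced.
all_matches = text.replace("fruits", "bread")

print(first_only)
print(all_matches)
```

Because the count applies to the whole string, reading the file in one piece lets a single call handle the "first match only" requirement.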

If the file is too large to read all at once, we can use the fileinput module to rewrite its contents in place:

import fileinput


def replace(file_path, pattern, subst):
    replaced = False
    for line in fileinput.input(file_path, inplace=True):
        if pattern in line and not replaced:
            line = line.replace(pattern, subst, 1)
            replaced = True
        print(line, end="")

Think of fileinput.input() as a function that opens the file so we can read it line by line. Because of the inplace=True argument, anything we print() inside this loop is written back to the file instead of to the screen.
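
If you want to try this without touching a real file, the following sketch runs the same idea against a throwaway temp file. The names replace_first and demo_path are hypothetical, and the file content is copied from the question. It also uses fileinput.input() as a context manager, so the redirected output is closed and the normal print() behaviour is restored when the loop ends.

```python
import fileinput
import os
import tempfile


def replace_first(file_path, pattern, subst):
    """Replace only the first occurrence of pattern in the file, in place."""
    replaced = False
    with fileinput.input(file_path, inplace=True) as stream:
        for line in stream:
            if not replaced and pattern in line:
                line = line.replace(pattern, subst, 1)
                replaced = True
            # With inplace=True, print() writes into the file being rewritten.
            print(line, end="")


# Build a throwaway file with the content from the question.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    handle.write("I like cookies\nI like fruits\nI like fruits\nI like fruits\n")

replace_first(demo_path, "fruits", "bread")

with open(demo_path) as handle:
    result = handle.read()
os.remove(demo_path)

print(result)
```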

Upvotes: 0

David Culbreth

Reputation: 2796

Before I answer the primary question, I feel it needs to be pointed out that calling readlines() on a file object reads everything up to the end of the file and returns a list of str, one per line, split at the \n characters (which are kept). After this operation the file pointer is at the end of the file, so attempting to iterate over that same file will not produce anything, meaning that the loop won't run. This is exactly what you're doing with these lines...

            # Read in the contents of old_file, divided by line, to lines.
            lines = old_file.readlines()

            ...

            # lines is now overwritten within the context of the loop, which
            # never runs, because old_file has already been read.
            for lines in old_file:
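
You can see this behaviour in isolation with a small sketch. It uses io.StringIO as a stand-in for an open text file (an assumption for the sake of a self-contained example; a real file object behaves the same way here), and the variable names leftover and again are made up.

```python
import io

# In-memory stand-in for an open text file.
old_file = io.StringIO("I like cookies\nI like fruits\n")

# readlines() consumes the stream up to the end.
lines = old_file.readlines()

# Iterating again yields nothing, because the pointer is already at the end.
leftover = [line for line in old_file]

# Rewinding with seek(0) makes the content readable again.
old_file.seek(0)
again = [line for line in old_file]

print(lines)
print(leftover)
print(again)
```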

Now, to address your question, the reason it continues to do the replacement after finding a match is that you don't stop it from replacing after you find a match. A simple flag can help you achieve this. In my example, I call it replacement_found.

from tempfile import mkstemp
from shutil import move, copymode
from os import fdopen, remove

file_path = "tmp.txt"

def replace(file_path, pattern, subst):
    fh, abs_path = mkstemp()
    with fdopen(fh,'w') as new_file:
        with open(file_path) as old_file:
            replacement_found = False
            for line in old_file:  # iterates over the file one line at a time, which uses less memory
                if replacement_found:
                    # A match was already replaced; copy the rest unchanged.
                    new_file.write(line)
                else:
                    # The count of 1 replaces at most one occurrence, even if the line has several.
                    new_file.write(line.replace(pattern, subst, 1))
                    if pattern in line:
                        replacement_found = True

    copymode(file_path, abs_path)
    remove(file_path)
    move(abs_path, file_path)

test_input = 'fruits'
replace_with = 'bread'
replace(file_path, test_input, replace_with)  # file_path is just for example purposes

Now the file contains what you're looking for:

I like cookies
I like bread
I like fruits
I like fruits

Upvotes: 2
