romesub

Reputation: 233

Python script to remove lines from file containing words in array

I have the following script, which identifies the lines in a file that I want to remove (based on an array of words) but does not actually remove them.

What should I change?

sourcefile = "C:\\Python25\\PC_New.txt" 
filename2 = "C:\\Python25\\PC_reduced.txt"

offending = ["Exception","Integer","RuntimeException"]

def fixup( filename ): 
    print "fixup ", filename 
    fin = open( filename ) 
    fout = open( filename2 , "w") 
    for line in fin.readlines(): 
        for item in offending: 
                print "got one",line 
                line = line.replace( item, "MUST DELETE" ) 
                line=line.strip()
                fout.write(line)  
    fin.close() 
    fout.close() 

fixup(sourcefile)

Upvotes: 1

Views: 8007

Answers (4)

zifot

Reputation: 2688

sourcefile = "C:\\Python25\\PC_New.txt" 
filename2 = "C:\\Python25\\PC_reduced.txt"

offending = ["Exception","Integer","RuntimeException"]

def fixup( filename ): 
    fin = open( filename ) 
    fout = open( filename2 , "w") 
    for line in fin: 
        if True in [item in line for item in offending]:
            continue
        fout.write(line)
    fin.close() 
    fout.close() 

fixup(sourcefile)

EDIT: Or even better:

for line in fin: 
    if not True in [item in line for item in offending]:
        fout.write(line)
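
The same test can also be written with the built-in any(), which stops at the first match instead of building a full list of booleans. A minimal sketch, assuming Python 2.5 or later (where any() is available); the helper name keep_line and the sample lines are made up for illustration:

```python
offending = ["Exception", "Integer", "RuntimeException"]

def keep_line(line, words=offending):
    # Hypothetical helper: True when none of the words occurs
    # as a substring of the line.
    return not any(word in line for word in words)

# Invented sample input, standing in for the lines of the real file
lines = ["int x = 1;\n", "throw new RuntimeException();\n", "Integer y;\n"]
kept = [line for line in lines if keep_line(line)]
```

Inside fixup() this would become "if keep_line(line): fout.write(line)".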

Upvotes: 5

SteamerX

Reputation: 1

'''This is a rather simple implementation but should do what you are searching for'''

sourcefile = "C:\\Python25\\PC_New.txt"

filename2 = "C:\\Python25\\PC_reduced.txt"

offending = ["Exception","Integer","RuntimeException"]

def fixup( filename ): 

    print "fixup ", filename 
    fin = open( filename ) 
    fout = open( filename2 , "w") 
    for line in fin.readlines(): 
        # Skip the line if it contains any of the offending words
        if any(item in line for item in offending):
            print "got one", line
            continue
        fout.write(line)
    fin.close() 
    fout.close() 

fixup(sourcefile)

Upvotes: 0

steveha

Reputation: 76695

The basic strategy is to write a copy of the input file to the output file, but with changes. In your case, the changes are very simple: you just omit the lines you don't want.

Once your copy is safely written, you can delete the original file and use os.rename() to rename the temp file to the original file name. I like to write the temp file in the same directory as the original file, both to make sure I have permission to write in that directory and because I don't know whether os.rename() can move a file from one volume to another.

You don't need to say for line in fin.readlines(); it is enough to say for line in fin. When you use .readlines() you are telling Python to read every line of the input file, all at once, into memory; when you just use fin by itself you read one line at a time.
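
To illustrate the difference, here is a small sketch that uses an in-memory file. io.StringIO only stands in for the real input file, and the sample text is invented:

```python
import io

text = u"first\nsecond\nthird\n"

# readlines() builds the whole list of lines in memory at once.
all_lines = io.StringIO(text).readlines()

# Iterating over the file object hands back one line at a time,
# each still ending with its newline.
seen = []
for line in io.StringIO(text):
    seen.append(line.rstrip("\n"))
```

For a three-line file the difference does not matter, but for a large file the loop never needs to hold more than the current line.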

Here is your code, modified to make these changes.

sourcefile = "C:\\Python25\\PC_New.txt" 
filename2 = "C:\\Python25\\PC_reduced.txt"

offending = ["Exception","Integer","RuntimeException"]

def line_offends(line, offending):
    for word in line.split():
        if word in offending:
            return True
    return False

def fixup( filename ): 
    print "fixup ", filename 
    fin = open( filename ) 
    fout = open( filename2 , "w") 
    for line in fin:
        if line_offends(line, offending):
            continue
        fout.write(line)
    fin.close()
    fout.close()
    #os.rename() left as an exercise for the student

fixup(sourcefile)

If line_offends() returns True, we execute continue and the loop continues without executing the next part. That means the line never gets written. For this simple example, it would really be just as good to do it this way:

    for line in fin:
        if not line_offends(line, offending):
            fout.write(line)

I wrote it with the continue because often there is non-trivial work being done in the main loop, and you want to avoid all of it if the test is true. IMHO it is nicer to have a simple "if this line is unwanted, continue" rather than indenting a whole bunch of stuff inside an if for a condition that might be very rare.
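
For completeness, here is one way the os.rename() step could look. This is only a sketch under a few assumptions: the function name remove_offending_lines and the ".tmp" suffix are made up, any() needs Python 2.5 or later, and the original file is deleted first as described above, so there is a brief moment when only the temp file exists.

```python
import os

def remove_offending_lines(path, offending):
    # Write the temp file next to the original so both are on the same volume.
    tmp_path = path + ".tmp"  # hypothetical temp-file name
    fin = open(path)
    fout = open(tmp_path, "w")
    try:
        for line in fin:
            # Same whole-word test as line_offends() above
            if any(word in offending for word in line.split()):
                continue
            fout.write(line)
    finally:
        fin.close()
        fout.close()
    # Delete the original, then give the temp file the original's name.
    os.remove(path)
    os.rename(tmp_path, path)
```

Calling remove_offending_lines(sourcefile, offending) would then rewrite the file in place, so keep a backup while you are still testing.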

Upvotes: 2

Sam Dolan

Reputation: 32532

You're not writing it to the output file. Also, I would use "in" to check for the string existing in the line. See the modified script below (not tested):

sourcefile = "C:\\Python25\\PC_New.txt" 
filename2 = "C:\\Python25\\PC_reduced.txt"

offending = ["Exception","Integer","RuntimeException"]

def fixup( filename ): 
    print "fixup ", filename 
    fin = open( filename ) 
    fout = open( filename2 , "w") 

    for line in fin.readlines(): 
        if not any(word in line for word in offending):
            # There are no offending words in this line
            # write it to the output file
            fout.write(line)

    fin.close() 
    fout.close() 

fixup(sourcefile)

Upvotes: 0
