erk499

Reputation: 13

When searching for words in a line from a text file, certain words aren't showing up

The following code reads a text file line by line, filters out the bad lines, and writes the good lines to a new file. For some reason, the output file only contains lines with '-'; none of the other search terms seem to match.

Is there a problem with this code that might cause this to happen? Or is it more likely a problem with the text file?

import re
new=open('FilteredData.txt', 'w')
f=open('ClusteredData.txt', 'r')
line = f.readline()

while line:
    reResult = re.search(r'-',line, re.I)
    reResult1 = re.search(r'by', line, re.I)
    reResult2=re.search(r'ft', line, re.I)
    reResult3=re.search(r'feat', line, re.I)
    reResult4=re.search(r'f\.', line, re.I)

    if reResult or reResult1 or reResult2 or reResult3 or reResult4:
        new.write(line)

    line = f.readline()

Upvotes: 1

Views: 125

Answers (1)

kanghj91

Reputation: 140

I ran into a similar problem before, and it was caused by a text encoding issue. The code looks fine to me: I ran it on a UTF-8 encoded text file without any non-ASCII characters, and it works. Is there any gibberish in your new text file? If there is, the problem is likely with the text file itself. Check that the file is being read with the encoding it was actually saved in.
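To illustrate how an encoding mismatch could produce exactly this symptom, here is a small sketch. It assumes a hypothetical scenario that the question does not confirm: the file was saved as UTF-16 but is read as a single-byte encoding. In that case every character is followed by a null byte, so a one-character pattern like '-' still matches while multi-character patterns like 'by' do not.

```python
import re

line = "Song by Artist - Remix\n"

# Hypothetical scenario (an assumption, not confirmed by the question):
# the text was saved as UTF-16 but decoded as a single-byte encoding.
raw = line.encode('utf-16-le')
misread = raw.decode('latin-1')   # every character is now followed by '\x00'

# A single-character pattern still matches the misread text...
dash_found = re.search(r'-', misread, re.I) is not None

# ...but 'by' has become 'b\x00y', so the two-character pattern fails.
by_found = re.search(r'by', misread, re.I) is not None

# Decoding with the encoding the bytes were written in restores the match.
by_found_correct = re.search(r'by', raw.decode('utf-16-le'), re.I) is not None

print(dash_found, by_found, by_found_correct)
```

If printing `repr(line)` for a few lines of the real file shows `\x00` between the letters, something like this may be what is happening.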

Maybe try running the code on a small subset of the text file and see if it works.
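A minimal sketch of that kind of check, assuming Python 3 and that the file is UTF-8 (the `encoding` default below is a guess, so substitute whatever the file really uses). The helper names `filter_lines` and `filter_file` are made up for illustration. The five patterns from the question are combined into one regex, and the `with` block closes both files so that buffered output is actually written to disk.

```python
import re

# The same five patterns as in the question, combined into one
# case-insensitive regex.
KEYWORDS = re.compile(r'-|by|ft|feat|f\.', re.IGNORECASE)

def filter_lines(lines):
    """Return only the lines that contain at least one keyword."""
    return [line for line in lines if KEYWORDS.search(line)]

def filter_file(src_path, dst_path, encoding='utf-8'):
    # encoding='utf-8' is an assumption; use the file's real encoding.
    # The with-block closes both files, flushing buffered output.
    with open(src_path, 'r', encoding=encoding) as src, \
         open(dst_path, 'w', encoding=encoding) as dst:
        dst.writelines(filter_lines(src))

# A small in-memory subset to check the matching logic on its own.
sample = [
    "Artist - Song\n",
    "Song by Artist\n",
    "Song ft Guest\n",
    "Song feat. Guest\n",
    "Song f. Guest\n",
    "Plain title\n",
]
kept = filter_lines(sample)
print(kept)
```

If the sample behaves as expected but the real file does not, that points toward the file (or its encoding) rather than the regexes. Calling `filter_file('ClusteredData.txt', 'FilteredData.txt')` would then run the same filter on the files from the question.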

Upvotes: 1
