kooper

Reputation: 97

Python: How to check a text file, compare it to each line in another text file, and print the lines that do not match

I'm pretty stuck here. Let's say I have a text file (example.txt) that looks like this:

Generic line 1() 46536.buildsomething  
Generic line 2() 98452.constructsomething  
Something I'm interested in seeing  
Another common line() blablabla abc945  
Yet another common line() runningoutofideashere.923954  
Another line I'm interested in seeing  
Line I don't care about 1() yaddayaddayadda  
Line I don't care about 2() yaddayaddayadda  
Generic line 3() 23485.buildsomething  
Yet some other common line  

I now have an exclusion text file (exclusions.txt) containing portions of lines to not print:

Generic  
common  
don't care about

The idea is I want to open up the example.txt file, open up the exclusions.txt file, then print any line in example.txt that does not contain any line in exclusions.txt.

What I've tried so far (without any success whatsoever):

textfile = open("example.txt", "r")
textfile = textfile.readlines()

exclusionslist = []
exclusions = open("exclusions.txt", "r")
exclusions = exclusions.readlines()
for line in exclusions:
    exclusionslist.append(line.rstrip('\n'))

for excline in exclusions:
    for line in textfile:
        if exline not in line:
            print line

I think I know what the problem is, but I have no idea how to fix it. I think I just need to tell Python that if a line in textfile contains any line in exclusions, do not print it.

Upvotes: 3

Views: 7078

Answers (2)

Mark

Reputation: 1649

Seems like you would want:

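# Read every line of the file to filter.
textfile = open("example.txt", "r")
textfilelines = textfile.readlines()
textfile.close()

# Read the exclusion substrings and strip the trailing newline from each.
exclusions = open("exclusions.txt", "r")
exclusionlines = exclusions.readlines()
exclusions.close()
for x in range(len(exclusionlines)):
    exclusionlines[x] = exclusionlines[x].strip("\n")

# Print only the lines that contain none of the exclusion substrings.
for line in textfilelines:
    found = False
    for exclude in exclusionlines:
        if exclude in line:
            found = True
            break  # one match is enough to exclude the line
    if not found:
        print line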

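This could probably be compressed with any() and a generator expression, but that would be harder to read. Depending on the output you want, you may also need to strip the \n from each entry in textfilelines: every line still ends with a newline and print adds another, so the output comes out double-spaced.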
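For reference, here is a minimal sketch of the same flag-based loop wrapped in a function, with the trailing newline stripped from both inputs so the output is not double-spaced. It is written for Python 3 (so print is a function, unlike in the snippet above), and the name keep_unmatched and the in-memory sample list are made up for illustration; they are not part of the original question.

```python
def keep_unmatched(text_lines, exclusion_lines):
    """Return the lines that contain none of the exclusion substrings."""
    # Strip the line endings once, up front, on the exclusion entries.
    exclusions = [e.rstrip("\n") for e in exclusion_lines]
    kept = []
    for line in text_lines:
        line = line.rstrip("\n")
        found = False
        for exclude in exclusions:
            if exclude in line:
                found = True
                break  # no need to test the remaining exclusions
        if not found:
            kept.append(line)
    return kept


# Hypothetical sample data, taken from a few lines of the question's example.txt.
sample = [
    "Generic line 1() 46536.buildsomething\n",
    "Something I'm interested in seeing\n",
    "Another common line() blablabla abc945\n",
    "Another line I'm interested in seeing\n",
    "Line I don't care about 1() yaddayaddayadda\n",
]
sample_exclusions = ["Generic\n", "common\n", "don't care about\n"]

for line in keep_unmatched(sample, sample_exclusions):
    print(line)
```

With this sample, only the two "interested in seeing" lines should be printed, each on its own line with no blank line in between.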

Upvotes: 1

Tim Pietzcker

Reputation: 336198

You're making it needlessly complicated:

with open("example.txt", "r") as text, open("exclusions.txt", "r") as exc:
    exclusions = [line.rstrip('\n') for line in exc]
    for line in text:
        if not any(exclusion in line for exclusion in exclusions):
            print line
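One thing to watch for with this approach: a blank line in exclusions.txt becomes the empty string, and the empty string is a substring of every line, so everything would be excluded. Below is a rough Python 3 sketch of the same any() idea that skips blank exclusion entries. The function name unmatched_lines and the temporary files in the demo are assumptions for illustration only, not something from the original post.

```python
import os
import tempfile


def unmatched_lines(text_path, exclusions_path):
    """Yield lines of text_path that contain no exclusion substring."""
    with open(exclusions_path) as exc:
        stripped = (line.rstrip("\n") for line in exc)
        # Skip blank entries: "" is a substring of every string, so a
        # blank entry would otherwise exclude every line.
        exclusions = [e for e in stripped if e]
    with open(text_path) as text:
        for line in text:
            if not any(exclusion in line for exclusion in exclusions):
                yield line.rstrip("\n")


# Demo with throwaway files (hypothetical contents based on the question).
with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "example.txt")
    exclusions_path = os.path.join(tmp, "exclusions.txt")
    with open(text_path, "w") as f:
        f.write(
            "Generic line 1() 46536.buildsomething\n"
            "Something I'm interested in seeing\n"
            "Yet some other common line\n"
        )
    with open(exclusions_path, "w") as f:
        f.write("Generic\ncommon\n\n")  # note the trailing blank line
    # Consume the generator while the temporary files still exist.
    result = list(unmatched_lines(text_path, exclusions_path))

for line in result:
    print(line)
```

Because the function is a generator, it reads example.txt one line at a time, which may matter if the file is large.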

Upvotes: 4
