Reputation: 305
I am in the process of learning Python. Please advise if my question does not fit the format. I would like to compare every single line of two txt files located in two different folders. The file name is the same in both folders. So far I have written the code below, and I would like to ask for help taking it further. The last two for loops are where my confusion is: I do not know how to compare each line of the two files there.
import os
dir1 = "C:/Users/Desktop/abc1-18/"
dir2 = "C:/Users/Desktop/cde1-18/"
for files in os.listdir(dir1):
    file_name1 = os.path.join(dir1, files)
    if files in os.listdir(dir2):
        file_name2 = os.path.join(dir2, files)
        with open(file_name1, "r") as fi:
            with open(file_name2, "r") as Ri:
                for line1 in fi:
                    for line2 in Ri:
                        if line1 == line2:
                            print "something"
Upvotes: 0
Views: 343
Reputation: 140148
The main issue here is that the Ri file handle is exhausted by the inner loop during the first iteration of the outer loop, so every later line1 is compared against nothing. You have to store the lines of the second file; I'd suggest a set for faster lookup:
with open(file_name1, "r") as fi:
    with open(file_name2, "r") as Ri:
        lines2 = set(Ri)
        for line1 in fi:
            if line1 in lines2:
                print "something"
That's much faster because a membership test on a set takes constant time on average, and it works because the second file is read only once, before the loop starts.
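To make the exhaustion issue concrete, here is a minimal sketch that compares the nested-loop approach with the set approach. It assumes Python 3 and uses io.StringIO objects as stand-ins for the two open files; the sample contents and the names naive_matches and set_matches are made up for illustration.

```python
import io

# In-memory stand-ins for the two open files (hypothetical contents).
fi = io.StringIO("a\nb\nc\n")
ri = io.StringIO("b\nc\nd\n")

# Nested loops: ri is fully consumed while the first line of fi is processed,
# so the inner loop body never runs again for the remaining lines of fi.
naive_matches = []
for line1 in fi:
    for line2 in ri:
        if line1 == line2:
            naive_matches.append(line1)

# Rewind both handles, then read the second "file" once into a set.
fi.seek(0)
ri.seek(0)
lines2 = set(ri)
set_matches = [line1 for line1 in fi if line1 in lines2]

print(naive_matches)
print(set_matches)
```

The nested version finds nothing here even though two lines are shared, because only the first line of fi is ever compared against anything.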
Aside, your outer loop could benefit from the same treatment. Change
for files in os.listdir(dir1):
    file_name1 = os.path.join(dir1, files)
    if files in os.listdir(dir2):
        file_name2 = os.path.join(dir2, files)
to
file2_dir = set(os.listdir(dir2))
for files in os.listdir(dir1):
    file_name1 = os.path.join(dir1, files)
    if files in file2_dir:  # test the bare file name, not the dir1 path
        file_name2 = os.path.join(dir2, files)
This avoids rescanning the second directory on every iteration, and storing the listing in a set speeds up each lookup.
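Putting both suggestions together, one possible sketch looks like the following. The function name shared_lines and the choice to return a dictionary are assumptions made for illustration, not part of the original code, and the isfile check is an extra guard against subdirectories that share a name.

```python
import os

def shared_lines(dir1, dir2):
    """Map each file name found in both directories to the lines of the
    dir1 copy that also occur somewhere in the dir2 copy."""
    names2 = set(os.listdir(dir2))  # scan the second directory only once
    result = {}
    for name in os.listdir(dir1):
        if name not in names2:
            continue
        path1 = os.path.join(dir1, name)
        path2 = os.path.join(dir2, name)
        # Skip entries that are not regular files in both places.
        if not (os.path.isfile(path1) and os.path.isfile(path2)):
            continue
        with open(path1, "r") as f1, open(path2, "r") as f2:
            lines2 = set(f2)  # read the second file only once
            result[name] = [line for line in f1 if line in lines2]
    return result
```

Calling shared_lines(dir1, dir2) with the two folder paths from the question would return the matching lines per file name, which you can then print or process as needed. Note that this reports a line from the first file if it appears anywhere in the second file, not only at the same position.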
Upvotes: 1