Reputation: 8564
I have written the code below to compare a file (F) with several other files in my path. Currently it only prints the result for one file. Any suggestions on how to perform the comparison and print all of the results?
import difflib
import fnmatch
import os
filelist=[]
f= open("D:/Desktop/data/sample/ff69c.txt")
flines= f.readlines()
path="D:/Desktop/data/sample/sample2"
for root, dirnames, filenames in os.walk(path):
    for filename in fnmatch.filter(filenames, '*.txt'):
        filelist.append(os.path.join(root, filename))
for m in filelist:
    g=open(m,'r')
    glines= g.readlines()
    # g.close()
    d = difflib.Differ()
    diff_list = list(d.compare(flines, glines))
    #print("".join(diff))
n_adds, n_subs, n_eqs, n_wiered = 0, 0, 0, 0
for diff_item in diff_list:
    if diff_item[0] == '+':
        n_adds += 1
    elif diff_item[0] == '-':
        n_subs +=1
    elif diff_item[0] == ' ':
        n_eqs += 1
    else:
        n_wiered += 1
print 'lines files #1: %d #2: %d' % (len(flines), len(glines))
print 'adds: %d subs: %d eqs: %d ?:%d ' % (n_adds, n_subs, n_eqs, n_wiered)
Upvotes: 0
Views: 108
Reputation: 16827
If you just want to know whether the files are identical, you can use filecmp.cmp. It avoids having to read all of the content in with readlines. Documentation:
filecmp.cmp(f1, f2[, shallow])
Compare the files named f1 and f2, returning True if they seem equal, False otherwise.
Unless shallow is given and is false, files with identical os.stat() signatures are taken to be equal. Files that were compared using this function will not be compared again unless their os.stat() signature changes. Note that no external programs are called from this function, giving it portability and efficiency.
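As a rough sketch of how that could look for one reference file against several others (written for Python 3; the helper name `find_identical` is made up for illustration, not part of filecmp): note that filecmp.cmp only reports equal or not equal, so it will not give you the add/remove counts from the question.

```python
import filecmp


def find_identical(reference, candidates):
    """Return the candidate paths whose contents match the reference file.

    Hypothetical helper, not part of the standard library.
    """
    # shallow=False forces a comparison of the file contents instead of
    # trusting the os.stat() signature alone.
    return [path for path in candidates
            if filecmp.cmp(reference, path, shallow=False)]
```

You would call it with the reference path and the filelist built by os.walk in the question.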
Also, to explore all the file combinations, you can use itertools.combinations (with r=2):
itertools.combinations(iterable, r)
Return r length subsequences of elements from the input iterable.
Combinations are emitted in lexicographic sort order. So, if the input iterable is sorted, the combination tuples will be produced in sorted order.
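Putting the two together might look something like this (Python 3; `identical_pairs` is a hypothetical name, and I am assuming you want every unordered pair checked once):

```python
import filecmp
import itertools


def identical_pairs(paths):
    """Return each pair of paths whose contents are equal.

    Hypothetical helper for illustration only.
    """
    pairs = []
    # Sorting first makes the order of the pairs deterministic;
    # r=2 yields every unordered pair exactly once.
    for a, b in itertools.combinations(sorted(paths), 2):
        if filecmp.cmp(a, b, shallow=False):
            pairs.append((a, b))
    return pairs
```

For n files this does n*(n-1)/2 comparisons, so it is only worth it if you really need all pairs rather than one file against the rest.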
Upvotes: 1
Reputation: 1143
diff_list is overwritten each time a file is read, so only the last file's diff is left when you count and print.
Try appending to diff_list rather than reassigning it with this line:
diff_list = list(...)
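Something along these lines should keep one result per file (a sketch in Python 3, not a drop-in replacement; `count_changes` and `compare_all` are names I made up, and `others` is assumed to be a dict mapping each path to that file's lines):

```python
import difflib


def count_changes(ref_lines, other_lines):
    """Count the Differ line tags between two lists of lines."""
    counts = {"adds": 0, "subs": 0, "eqs": 0, "hints": 0}
    # Differ prefixes every output line with '+', '-', ' ' or '?'.
    tags = {"+": "adds", "-": "subs", " ": "eqs", "?": "hints"}
    for item in difflib.Differ().compare(ref_lines, other_lines):
        counts[tags[item[0]]] += 1
    return counts


def compare_all(ref_lines, others):
    """Return a list of (label, counts), one entry per compared file."""
    results = []
    for label, lines in others.items():
        # Append one entry per file instead of reassigning one variable.
        results.append((label, count_changes(ref_lines, lines)))
    return results
```

You can then loop over the returned list and print the counts for each file, instead of printing once after the loop.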
Upvotes: 2