Reputation: 19164
I have an Excel file with several columns.
COL 1 | COL 2 | COL 3
ABCD | ABC(D) | CDA
AB CD | ABC D | C D - (B)
A B C D | (ABCD) | ABCD
ABC D | ABDC | ABC D
A(BC ) D | AD B - C| AB CD
I want to compare every column with every other column and print the similarities and differences between them.
For example, comparing COL 1 and COL 2:
similarities :
None
differences :
ABCD
AB CD
A B C D
A(BC ) D
ABC(D)
ABC D
(ABCD)
ABDC
AD B - C
Then COL 2 is compared with COL 3, and COL 1 with COL 3. Only exact string matches count; even a whitespace difference is a mismatch. The number of columns may increase, and the comparison starts from the 2nd row of each column.
How can I implement such a recursive comparison in Python so that it runs fast?
Upvotes: 1
Views: 2952
Reputation: 4912
You can use xlrd. First, read the content from your file. Second, store the three columns in three dictionaries, since dict lookups are fast. Third, compare them and output the result.
I suggest you check the xlrd API and write the code yourself. Here is the link.
Any questions, feel free to ask.
EDIT:
Here is an example.
#!/usr/bin/python
#-*- coding:utf-8 -*-

# Two columns stored as {row number: cell value}
name = {1: 'a', 2: 'b', 3: 'c'}
lname = {1: 'g', 2: 'b', 3: 'v'}

common = {}      # rows where both columns hold the same value
diff_name = {}   # values from name on rows that differ
diff_lname = {}  # values from lname on rows that differ

for key in name:
    if name[key] == lname[key]:
        common[key] = name[key]
    else:
        diff_name[key] = name[key]
        diff_lname[key] = lname[key]

print('common part is:', common)
print('diff_name is: ', diff_name)
print('diff_lname is: ', diff_lname)
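To fill such dictionaries from the actual workbook, a minimal sketch with xlrd might look like the following. The helper names (columns_from_sheet, read_columns) and the choice of the first sheet are assumptions made for illustration, not part of the answer. Note that xlrd 2.0 and later only reads .xls files, so an .xlsx workbook would need a different reader.

```python
def columns_from_sheet(sheet, first_row=1):
    """Map each column index to {row index: cell value}.

    `sheet` is expected to behave like an xlrd Sheet (it needs `ncols`
    and `col_values`). first_row=1 skips the header row, so the data
    starts at the second row as the question asks.
    """
    columns = {}
    for col in range(sheet.ncols):
        values = sheet.col_values(col, start_rowx=first_row)
        columns[col] = dict(enumerate(values, start=first_row))
    return columns


def read_columns(path, sheet_index=0, first_row=1):
    """Open a workbook with xlrd and return its columns as dictionaries."""
    import xlrd  # third-party package; imported lazily so the helper above has no dependency

    book = xlrd.open_workbook(path)
    return columns_from_sheet(book.sheet_by_index(sheet_index), first_row)
```

Calling read_columns on the workbook path would return one dictionary per column, and any two of them can be fed to the comparison loop above in place of name and lname. Cells are compared as read, so numeric cells come back as floats, not strings.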
Upvotes: 2
Reputation: 12846
An algorithm might be
for colA in range(0, N):
    # pair colA with every column to its right, including the last one
    for colB in range(colA + 1, N):
        compare(colA, colB)
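The same pairing can be sketched with itertools.combinations, which yields each unordered pair of columns exactly once. This is only an illustration: it assumes the comparison is done row by row (the question could also be read as comparing the sets of values), and the function names and the hard-coded sample data, taken from the question's table with surrounding spaces stripped, are hypothetical.

```python
from itertools import combinations


def compare_columns(col_a, col_b):
    """Compare two columns row by row using exact string equality."""
    same, diff = [], []
    for row, (a, b) in enumerate(zip(col_a, col_b)):
        if a == b:
            same.append((row, a))
        else:
            diff.append((row, a, b))
    return same, diff


def compare_all(columns):
    """Compare every pair of columns; `columns` maps header -> list of cells."""
    return {
        (name_a, name_b): compare_columns(columns[name_a], columns[name_b])
        for name_a, name_b in combinations(columns, 2)
    }


# Sample data from the question (header row already removed).
columns = {
    "COL 1": ["ABCD", "AB CD", "A B C D", "ABC D", "A(BC ) D"],
    "COL 2": ["ABC(D)", "ABC D", "(ABCD)", "ABDC", "AD B - C"],
    "COL 3": ["CDA", "C D - (B)", "ABCD", "ABC D", "AB CD"],
}

for (name_a, name_b), (same, diff) in compare_all(columns).items():
    print("comparing", name_a, "and", name_b)
    print("  similarities:", same or None)
    print("  differences:", diff)
```

With N columns this performs N*(N-1)/2 comparisons, each linear in the number of rows, so adding columns only adds more pairs without changing the code.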
Upvotes: 1