sam

Reputation: 19164

Excel column comparison using Python

I have an Excel file with several columns.

COL 1    | COL 2    | COL 3  

ABCD     |  ABC(D)  |   CDA  
AB CD    | ABC D    |   C D - (B)  
A B C D  | (ABCD)   |   ABCD  
ABC D    | ABDC     | ABC D  
A(BC ) D |  AD B - C|   AB CD

I want to compare every column with every other column and print the similarities and differences between them.

For example:

  1. comparing COL 1 and COL 2

    similarities :

    None
    

    differences :

    ABCD
    AB CD
    A B C D
    A(BC ) D
    ABC(D)
    ABC D
    (ABCD)
    ABDC
    AD B - C
    

Then compare COL 2 with COL 3, and COL 1 with COL 3. Only exact string matches count; even a whitespace difference is a mismatch. The number of columns may grow, and the comparison should start from the second row of each column.

How can I implement such a recursive comparison in Python so that it runs quickly?

Upvotes: 1

Views: 2952

Answers (2)

Stephen Lin

Reputation: 4912

You can use xlrd. First, read the contents of your file. Second, store the three columns in three dictionaries, since dictionary lookups are fast. Third, compare them and output the result.

I suggest you check the xlrd API and write the code yourself. Here is the link.

If you have any questions, feel free to ask.
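As a rough sketch of the reading step, the snippet below pulls every column of a sheet into a list, skipping the first row as the question asks. The helper names `read_columns` and `load_first_sheet` and the file name `data.xls` are made up for illustration. Note that recent xlrd releases (2.0 and later) read only the old `.xls` format, so an `.xlsx` file would need a different reader.

```python
def read_columns(sheet, start_row=1):
    """Return each column of `sheet` as a list of cell values.

    start_row=1 skips the first row, so comparison starts from the
    second row as requested in the question.
    """
    return [sheet.col_values(col, start_rowx=start_row)
            for col in range(sheet.ncols)]


def load_first_sheet(path):
    """Open a workbook with xlrd and return its first sheet."""
    import xlrd  # imported here so read_columns works without xlrd installed
    book = xlrd.open_workbook(path)
    return book.sheet_by_index(0)


# Hypothetical usage; "data.xls" is a placeholder file name:
# columns = read_columns(load_first_sheet("data.xls"))
```

`read_columns` only relies on the sheet's `ncols` attribute and `col_values` method, so it can be tried out with a small stand-in object before pointing it at a real file.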

EDIT:

Here is an example.

#!/usr/bin/env python3

# Two sample columns, keyed by row number.
name = {1: 'a', 2: 'b', 3: 'c'}
lname = {1: 'g', 2: 'b', 3: 'v'}
common = {}
diff_name = {}
diff_lname = {}

# Compare the two columns row by row (both must share the same keys).
for key in name:
    if name[key] == lname[key]:
        common[key] = name[key]
    else:
        diff_name[key] = name[key]
        diff_lname[key] = lname[key]

print('common part is:', common)
print('diff_name is:', diff_name)
print('diff_lname is:', diff_lname)
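The example above compares values row by row. If a value should instead count as a similarity whenever it appears anywhere in the other column (an assumption; the question does not say which is meant), a set-based variant might look like the sketch below. The function name `compare_columns` and the sample data are invented for illustration.

```python
def compare_columns(col_a, col_b):
    """Split two columns into shared values and values found on one side only.

    Matching is exact, so "x y" and "xy" are treated as different strings.
    """
    set_a, set_b = set(col_a), set(col_b)  # sets give fast membership tests
    common = [v for v in col_a if v in set_b]
    differences = [v for v in col_a if v not in set_b]
    differences += [v for v in col_b if v not in set_a]
    return common, differences


common, differences = compare_columns(["x", "x y", "z"], ["x y", "(z)"])
print("similarities:", common)
print("differences:", differences)
```

Building the sets once per column keeps each lookup cheap, while the list comprehensions preserve the original row order in the output.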

Upvotes: 2

schoetbi

Reputation: 12846

An algorithm might be:

for colA in range(N):
    for colB in range(colA + 1, N):
        compare(colA, colB)
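The same pairing can be sketched with `itertools.combinations`, which yields each unordered pair of columns exactly once, so it keeps working when more columns are added. `compare_all` and `shared_values` are hypothetical names, and the compare function here is only a placeholder.

```python
from itertools import combinations


def compare_all(columns, compare):
    """Call compare(a, b) once for every unordered pair of columns."""
    results = {}
    for (i, col_a), (j, col_b) in combinations(enumerate(columns), 2):
        results[(i, j)] = compare(col_a, col_b)
    return results


def shared_values(col_a, col_b):
    """Placeholder compare function: values present in both columns."""
    return set(col_a) & set(col_b)


columns = [["a", "b"], ["b", "c"], ["c", "a"]]
pairs = compare_all(columns, shared_values)
for (i, j), shared in pairs.items():
    print("COL", i + 1, "vs COL", j + 1, "->", shared)
```

For N columns this makes N * (N - 1) / 2 comparisons, the same as the nested loops.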

Upvotes: 1
