compare 2 excel files by keeping 1 column fixed of 1 sheet and then comparing with another file with same col by using python

Question

We have 2 excel files one having 7.5k records and other having 7k records. We need to compare the data by keeping one specific column as fixed from one sheet to compare with another sheet.

For Example sheet1:

**Emp_ID|   Name|   Phone|  Address**
-------------------------------------
1       |     A |    123 |  ABC
-------------------------------------
2       |     B |    456 |  CBD
-------------------------------------
3       |     C |    789 |  S

For Example sheet2:

**Emp_ID|   Name|   Phone|  Address**
-------------------------------------
1       |     A |    123 |  ABC
-------------------------------------
3       |     C |    789 |  S

Python Comparison should on the basis of Emp_ID and Emp_ID=2 should be output as missing when passing the argument as Emp_ID while executing the python script. I am trying the same by using XLRD module, but its comparing only cell by cell instead of freezing one column and then comparing the row with other excel file.

def compareexcel(oldSheet, newSheet):
        rowb2 = xlrd.open_workbook(oldSheet)
        rowb1 = xlrd.open_workbook(newSheet)
        sheet1 = rowb1.sheet_by_index(0)
        sheet2 = rowb2.sheet_by_index(0)

        for rownum in range(max(sheet1.nrows, sheet2.nrows)):
            if rownum < sheet1.nrows:
                row_rb1 = sheet1.row_values(rownum)
                row_rb2 = sheet2.row_values(rownum)

                for colnum, (c1, c2) in enumerate(izip_longest(row_rb1, row_rb2)):
                    if c1 != c2:                    
                        print "Row {} Col {} - {} != {}".format(rownum+1, colnum+1, c1, c2)

JyotiGrover · Accepted Answer

I have written one function to search for a column value into another sheet and on the basis of that comparison would take place in compare function

def search(sheet2 , s):
    for row in range(sheet2.nrows):`enter code here`
        if s == sheet2.cell(row,0).value:
            return (row,0)
    return (9,9)

def compare(oldPerPaxSheet,newPerPaxSheet):
    rb1 = xlrd.open_workbook(oldPerPaxSheet)
    rb2 = xlrd.open_workbook(newPerPaxSheet)
    sheet1 = rb1.sheet_by_index(0)
    sheet2 = rb2.sheet_by_index(0)

    for rownum in range(max(self.sheet1.nrows, self.sheet2.nrows)):
            if rownum < sheet1.nrows:
                    row_rb1 = sheet1.row_values(rownum)
                    print ("row_rb1 : "), row_rb1

                    search_str = sheet1.cell(rownum,0).value

                    r,c = search(sheet2,search_str)
                    if (c != 9):
                            row_rb2 = sheet2.row_values(r)
                            for colnum, (c1, c2) in enumerate(izip_longest(row_rb1, row_rb2)):
                                    if c1 != c2:                    
                                            print "Row {} Col {} - {} != {}".format(rownum+1, colnum+1, c1, c2)
                    else:
                            print ("ROw does not exists in the other sheet")
                            pass
            else:
                    print ("Row {} missing").format(rownum+1)

compare 2 excel files by keeping 1 column fixed of 1 sheet and then comparing with another file with same col by using python

Answers (2)

Related Questions