Rod
Rod

Reputation: 75

formatting colour for cell and row in excel using pandas

my requirement highlight the different of those cell to "yellow" and new rows added will be highlight in red colour in excel as a output.

load1 dataset

StudentID   Visit1      Visit 2     Visit 3     Visit 4
1           16-04-20    23-04-20    30-04-20    07-05-20
2           17-04-20    24-04-20    01-05-20    08-05-20
3           18-04-20    25-04-20    02-05-20    09-05-20
4           19-04-20    26-04-20    03-05-20    10-05-20
5           20-04-20    27-04-20    04-05-20    11-05-20
6           21-04-20    28-04-20    05-05-20    12-05-20
7           22-04-20    29-04-20    06-05-20    13-05-20
8           23-04-20    30-04-20    07-05-20    14-05-20

load2 table

StudentID   Visit1      Visit 2     Visit 3     Visit 4
1           16-04-20    23-04-20    30-04-20    07-05-20
2           17-04-20    24-04-20    01-05-20    08-05-20
3           18-04-20    25-09-20    02-05-20    09-05-20
4           19-04-20    26-04-20    03-06-20    10-05-20
5           20-04-20    27-04-20    04-05-20    11-05-20
6           21-04-20    28-04-20    05-05-20    12-06-20
7           22-04-20    29-08-20    06-05-20    13-05-20
8           23-04-20    30-04-20    07-05-20    14-05-20
9           15-05-20    16-05-20    17-05-20    18-05-20

Output. I am looking for the output in excel in the below format with colours hi-lighted.

enter image description here

I am new to this pandas, I was able to get the different difference at dataframe level. Not at the outputed excel where to format each cell and row with colours.Please help .

the below are the script i tried to get the difference.

import pandas as pd
import os
import numpy as np
colour1= pd.read_excel('pandas.xlsx',sheet_name='load1')
colour2= pd.read_excel('pandas.xlsx',sheet_name='load2')
colour_merge=colour1.merge(colour2,left_on='StudentID', right_on='StudentID',how='outer')
colour_merge['Visit1dif']= np.where(colour_merge['Visit1_x']==colour_merge['Visit1_y'],0,1)
colour_merge['Visit2dif']= np.where(colour_merge['Visit 2_x']==colour_merge['Visit 2_y'],0,1)
colour_merge['Visit3dif']= np.where(colour_merge['Visit 3_x']==colour_merge['Visit 3_y'],0,1)
colour_merge['Visit4dif']= np.where(colour_merge['Visit 4_x']==colour_merge['Visit 4_y'],0,1)
colour_merge[['StudentID','Visit1_x','Visit1_y','Visit1dif','Visit 2_x','Visit 2_y','Visit2dif','Visit 3_x','Visit 3_y','Visit3dif','Visit 4_x','Visit 4_y','Visit4dif']]

Upvotes: 1

Views: 656

Answers (1)

Barbaros Özhan
Barbaros Özhan

Reputation: 65105

I think you have two sheets load1 and load2, and want to and the third one to be displayed in the picture.

In order to add a new sheet, pandas.ExcelWriter should be opened in append mode and openpyxl library is needed along with pandas.DataFrame.to_excel and conditional_formatting functions :

import pandas as pd
import openpyxl
import xlrd

from openpyxl.styles import Alignment, Font, NamedStyle, PatternFill, colors
from openpyxl.styles.differential import DifferentialStyle
from openpyxl.formatting.rule import Rule, ColorScaleRule, CellIsRule, FormulaRule

file = r'C:\\app\\excel\\pandas.xlsx'
xls = xlrd.open_workbook(file)
sht = xls.sheet_names()

colour1= pd.read_excel(file,sheet_name=sht[0])
colour2= pd.read_excel(file,sheet_name=sht[1])
colour_merge = colour1.merge(colour2,left_on='StudentID',right_on='StudentID',how='outer')
colour_merge = colour_merge[['StudentID','Visit 1_x','Visit 1_y','Visit 2_x','Visit 2_y','Visit 3_x','Visit 3_y','Visit 4_x','Visit 4_y']]

l_sid = []
mxc1 = max(colour1['StudentID'])+1
mxc2 = max(colour2['StudentID'])+1
for j in range(min(mxc1,mxc2),max(mxc1,mxc2)):
    l_sid.append(j+1)

writer_args = { 'path': file, 'mode': 'a', 'engine': 'openpyxl'}

with pd.ExcelWriter(**writer_args) as xlsx:
    colour_merge.to_excel(xlsx, 'load3', index=False)
    ws = xlsx.sheets['load3']
    mxc= ws.max_column


    title_row = '1'

    yellow = PatternFill(bgColor=colors.YELLOW)
    red    = PatternFill(bgColor=colors.RED)    
    i=1    
    while i <= mxc*2:
        l_col = []
        l_col.append(chr(i+65))
        l_col.append(chr(i+65+1))

        for j in range(2,mxc+1):
            for k in l_col:
                if j not in l_sid:
                    ws.conditional_formatting.add(k+str(j), FormulaRule(formula=[l_col[0]+'$'+str(j)+'<>'+l_col[1]+'$'+str(j)], stopIfTrue=True, fill=yellow))

        i+=2


    r = Rule(type="expression", dxf=DifferentialStyle(fill=red), stopIfTrue=False)
    r.formula = ['1=1']

    while l_sid:
        el=l_sid.pop(-1)        
        ws.conditional_formatting.add('A'+str(el)+':'+chr(64+mxc)+str(el), r) 


xlsx.save()

Upvotes: 1

Related Questions