T0167
T0167

Reputation: 175

How to Compare Elements of two excel sheets and add New Counts and Elements to previous sheet using Python

I have two Excel sheets. Excel-1 looks like this

Name   Sample-A    Sample-B
A1        2           3
B1        3           4

Excel-2 looks like-

Type               Solution
Sample-A               A1
Sample-A               B1
Sample-B               B1
Sample-B               A1
Sample-A               C1
Sample-A               A1

What I need to do is, get the count of each element A1,B1 for each sample and add it to Excel-1. If there is a new element in Excel-2 (in this case C1) add it to Excel-1 as well. For this particular example the result should look like this-

Name   Sample-A    Sample-B
A1        4           4
B1        4           1
C1        1           0

I need to this whole thing using Python.

So far I used dataframe.count_values() to extract the total counts from Excel-2. I used dataframe.iloc[] to create dataframe for each type (sample A and Sample B) from Excel-1. But I am not sure how to add these values or new element in `Excel-1. Sample code is given below-

df1 = pd.read_excel(File Location)  #Reading Excel-1
Sample_A_df=df1[(df1['Type']=='Sample-A')]  #Creating dataframe for each type
Sample_B_df=df1[(df1['Type']=='Sample-B')]
CountA=Sample_A_df['Solution'].value_counts() #Getting counts for each element
CountB=Sample_B_df['Solution'].value_counts()
CountAdf=CountA.to_frame() #Converting the Count series to dataframe
CountBdf=CountB.to_frame()

df2 = pd.read_excel(File Location)
AfromPrev=df2.iloc[16:43,0:2]  #iloc is used as I need values from this range only
BfromPrev=df2.iloc[16:43,[0,2]]
AfromPrevList=AfromPrev['Sample-A'].tolist() 
BfromPrevList=BfromPrev['Sample-B'].tolist()

*P.S. I am basically creating a new worksheet as I believe I can not edit an excel sheet in Python, I need to create a new workbook. *

I have only started learning python for a very few months now and I am not sure what to do next, I can not figure out the proper logic to execute this task.

Upvotes: 2

Views: 42

Answers (1)

asimo
asimo

Reputation: 2500

Assuming your Excel-1 is pulled in as df1 and Excel-2 is pulled in as df2:

df3 = pd.crosstab(df2.Solution,df2.Type)

This will make d3 dataframe look like:

Index  Sample-A Sample-B
A1     2        1
B1     1        1
C1     1        0

Then you set the column Name as index for df1 (Excel-1) using

df1 = df1.set_index('Name')

Then you concat the two dataframes(the crosstabbed one and the Excel1) using sum()

df5 = pd.concat([df1,df3]).groupby(level=0).sum()

This will make your desired df look like:

Index   Sample-A    Sample-B
A1      4           4
B1      4           5
C1      1           0

Upvotes: 1

Related Questions