Reputation: 175
I have two Excel sheets
. Excel-1
looks like this
Name Sample-A Sample-B
A1 2 3
B1 3 4
Excel-2
looks like-
Type Solution
Sample-A A1
Sample-A B1
Sample-B B1
Sample-B A1
Sample-A C1
Sample-A A1
What I need to do is, get the count of each element A1,B1
for each sample and add it to Excel-1
. If there is a new element in Excel-2
(in this case C1
) add it to Excel-1
as well. For this particular example the result should look like this-
Name Sample-A Sample-B
A1 4 4
B1 4 1
C1 1 0
I need to this whole thing using Python
.
So far I used dataframe.count_values()
to extract the total counts
from Excel-2
. I used dataframe.iloc[]
to create dataframe
for each type (sample A and Sample B)
from Excel-1
. But I am not sure how to add these values or new element in `Excel-1.
Sample code is given below-
df1 = pd.read_excel(File Location) #Reading Excel-1
Sample_A_df=df1[(df1['Type']=='Sample-A')] #Creating dataframe for each type
Sample_B_df=df1[(df1['Type']=='Sample-B')]
CountA=Sample_A_df['Solution'].value_counts() #Getting counts for each element
CountB=Sample_B_df['Solution'].value_counts()
CountAdf=CountA.to_frame() #Converting the Count series to dataframe
CountBdf=CountB.to_frame()
df2 = pd.read_excel(File Location)
AfromPrev=df2.iloc[16:43,0:2] #iloc is used as I need values from this range only
BfromPrev=df2.iloc[16:43,[0,2]]
AfromPrevList=AfromPrev['Sample-A'].tolist()
BfromPrevList=BfromPrev['Sample-B'].tolist()
*P.S. I am basically creating a new worksheet as I believe I can not edit an excel sheet in Python, I need to create a new workbook. *
I have only started learning python for a very few months now and I am not sure what to do next, I can not figure out the proper logic to execute this task.
Upvotes: 2
Views: 42
Reputation: 2500
Assuming your Excel-1 is pulled in as df1 and Excel-2 is pulled in as df2:
df3 = pd.crosstab(df2.Solution,df2.Type)
This will make d3 dataframe look like:
Index Sample-A Sample-B
A1 2 1
B1 1 1
C1 1 0
Then you set the column Name as index for df1 (Excel-1) using
df1 = df1.set_index('Name')
Then you concat the two dataframes(the crosstabbed one and the Excel1) using sum()
df5 = pd.concat([df1,df3]).groupby(level=0).sum()
This will make your desired df look like:
Index Sample-A Sample-B
A1 4 4
B1 4 5
C1 1 0
Upvotes: 1