Reputation: 13
I load Excel data with the columns Time, Name, Good, and Bad using Python and pandas.
I want to transform this DataFrame into another DataFrame that meets certain conditions.
Specifically, I would like to print a DataFrame that stores the sum of the Good and Bad values for each Name over the entire time range.
Could anybody who knows Python and pandas well please help?
Upvotes: 1
Views: 56
Reputation: 863501
First aggregate with sum via DataFrame.groupby, rename the columns with DataFrame.add_prefix, add a new column with DataFrame.assign, and finally convert the index to a column with DataFrame.reset_index:
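import pandas as pd

df = pd.DataFrame({
    'Name': list('aaabbb'),
    'Bad': [1, 3, 5, 7, 1, 0],
    'Good': [5, 3, 6, 9, 2, 4]
})

# Select the columns with a list; recent pandas versions reject the bare 'Good','Bad' tuple.
df1 = (df.groupby('Name')[['Good', 'Bad']]
         .sum()                                         # per-Name totals
         .add_prefix('Total_')                          # Good -> Total_Good, Bad -> Total_Bad
         .assign(Total_Count=lambda x: x.sum(axis=1))   # row-wise sum of both totals
         .reset_index())                                # turn the Name index back into a column
print(df1)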
  Name  Total_Good  Total_Bad  Total_Count
0    a          14          9           23
1    b          15          8           23
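The sample frame in this answer has no Time column, while the question's data does. A minimal sketch of the same chain on data that includes one, assuming daily timestamps (the dates and the `totals` name are made up for illustration): because only Good and Bad are selected after the groupby, Time stays out of the sums.

```python
import pandas as pd

# Hypothetical sample data with the Time column mentioned in the question.
df = pd.DataFrame({
    'Time': pd.date_range('2021-01-01', periods=6, freq='D'),
    'Name': list('aaabbb'),
    'Good': [5, 3, 6, 9, 2, 4],
    'Bad': [1, 3, 5, 7, 1, 0],
})

# Selecting only Good and Bad keeps Time out of the aggregation.
totals = (df.groupby('Name')[['Good', 'Bad']]
            .sum()
            .add_prefix('Total_')
            .assign(Total_Count=lambda x: x.sum(axis=1))
            .reset_index())
print(totals)
```

If the totals should cover only part of the time range, filtering the rows on Time before the groupby would be one way to do it.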
Upvotes: 2
Reputation: 153510
Use pandas named aggregation (NamedAgg) together with DataFrame.eval:
df.groupby('Name')[['Good', 'Bad']]\
  .agg(Total_Good=('Good', 'sum'),
       Total_Bad=('Bad', 'sum'))\
  .eval('Total_Count = Total_Good + Total_Bad')
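For reference, a self-contained sketch of this approach that can be run as is. The sample frame and the `df2` name are assumptions for illustration, and the trailing `reset_index` is an optional addition that turns Name back into a regular column:

```python
import pandas as pd

# Assumed sample data; replace with the frame loaded from Excel.
df = pd.DataFrame({
    'Name': list('aaabbb'),
    'Bad': [1, 3, 5, 7, 1, 0],
    'Good': [5, 3, 6, 9, 2, 4],
})

df2 = (df.groupby('Name')
         .agg(Total_Good=('Good', 'sum'),    # named aggregation: new_name=(column, func)
              Total_Bad=('Bad', 'sum'))
         .eval('Total_Count = Total_Good + Total_Bad')  # returns a new frame with the extra column
         .reset_index())
print(df2)
```

Named aggregation sets the output column names directly, so no separate renaming step is needed.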
Upvotes: 1