Reputation: 45
I have the following pandas DataFrame example. I am trying to sum some specific rows. I have researched how to do this, but I could not find a solution. Could you point me in the right direction, please? The example is below. I thought I could apply a group by and sum, but there is a column (Value_3) that I do not want to sum; it should just be kept as is. Value_3 is a constant value determined by the Machine and Shift values.
import pandas as pd

data = {'Machine':['Mch_1', 'Mch_1', 'Mch_1', 'Mch_1', 'Mch_2', 'Mch_2'], 'Shift':['Day', 'Day', 'Night', 'Night', 'Night', 'Night'], 'Value_1':[1, 2, 0, 0, 1, 3], 'Value_2':[0, 2, 2, 1, 3, 0], 'Value_3':[5, 5, 2, 2, 6, 6]}
df = pd.DataFrame(data)
Output:
  Machine  Shift  Value_1  Value_2  Value_3
0   Mch_1    Day        1        0        5
1   Mch_1    Day        2        2        5
2   Mch_1  Night        0        2        2
3   Mch_1  Night        0        1        2
4   Mch_2  Night        1        3        6
5   Mch_2  Night        3        0        6
What I would like to have is shown in the following DataFrame.
expected = {'Machine':['Mch_1', 'Mch_1', 'Mch_2'], 'Shift':['Day', 'Night', 'Night'], 'Value_1':[3, 0, 4], 'Value_2':[2, 3, 3], 'Value_3':[5, 2, 6]}
df_expected = pd.DataFrame(expected)
df_expected
Output:
  Machine  Shift  Value_1  Value_2  Value_3
0   Mch_1    Day        3        2        5
1   Mch_1  Night        0        3        2
2   Mch_2  Night        4        3        6
Thank you very much.
Upvotes: 1
Views: 54
Reputation: 862691
The first idea is to pass a dictionary of aggregate functions to agg; for the last column it is possible to use first or last:
d = {'Value_1':'sum','Value_2':'sum','Value_3':'first'}
df1 = df.groupby(['Machine','Shift'], as_index=False).agg(d)
If you want a more dynamic solution, meaning sum all columns except Value_3, create the dictionary from all columns except those specified in the list, using dict.fromkeys and Index.difference:
d = dict.fromkeys(df.columns.difference(['Machine','Shift', 'Value_3']), 'sum')
d['Value_3'] = 'first'
df1 = df.groupby(['Machine','Shift'], as_index=False).agg(d)
print(df1)
  Machine  Shift  Value_1  Value_2  Value_3
0   Mch_1    Day        3        2        5
1   Mch_1  Night        0        3        2
2   Mch_2  Night        4        3        6
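Using first for Value_3 relies on the question's statement that Value_3 is constant for each Machine and Shift pair. If you are not sure that holds for your real data, a quick check before aggregating could look like the sketch below. The names distinct_counts, value_3_is_constant and agg_map are only illustrative, and the data is the sample from the question.

```python
import pandas as pd

# Sample data copied from the question
data = {'Machine': ['Mch_1', 'Mch_1', 'Mch_1', 'Mch_1', 'Mch_2', 'Mch_2'],
        'Shift': ['Day', 'Day', 'Night', 'Night', 'Night', 'Night'],
        'Value_1': [1, 2, 0, 0, 1, 3],
        'Value_2': [0, 2, 2, 1, 3, 0],
        'Value_3': [5, 5, 2, 2, 6, 6]}
df = pd.DataFrame(data)

# Count the distinct Value_3 values inside each (Machine, Shift) group
distinct_counts = df.groupby(['Machine', 'Shift'])['Value_3'].nunique()

# True only if every group holds exactly one distinct Value_3
value_3_is_constant = bool((distinct_counts == 1).all())

# Same aggregation as above: sum the value columns, keep the first Value_3
agg_map = {'Value_1': 'sum', 'Value_2': 'sum', 'Value_3': 'first'}
result = df.groupby(['Machine', 'Shift'], as_index=False).agg(agg_map)
```

If value_3_is_constant is False, first would silently keep only one of several different values, so in that case it is worth deciding explicitly how Value_3 should be combined.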
Upvotes: 3