Reputation: 2217
I have a dataset looking like that:
number day hour
12 20180101 10
7 20180101 10
8 20180101 11
78 20180101 12
And I would like to get a new dataframe grouping by day,hour, having a sum of number so the result look like :
time amount
2018010110 19
2018010111 8
2018010112 78
To do that I am writing the following:
new_df['amount'] = df.groupby(['day','hour'])['number'].agg('sum')
The result i get is
But then day and hour seems to be sub keys of number. How can I transform new_df
so it only have the desired columns ?
If I try to access new_df['day']
or new_df['amount']['day']
I get error that the key does not exist.
If i could access it I would run something like
new_df['time'] = new_df.... .map(str) + new_df... .map(str)
Upvotes: 1
Views: 54
Reputation: 7828
IIUC:
In [24]: df['day_hour'] = df['day'].astype(str)+df['hour'].astype(str)
In [27]: wd.groupby(['day_hour'],as_index=False)['number'].sum()
Out[27]:
day_hour number
0 2018010110 19
1 2018010111 8
2 2018010112 78
Upvotes: 4
Reputation: 76917
You could use
In [160]: df.groupby(df.day * 100 + df.hour)['number'].sum()
Out[160]:
2018010110 19
2018010111 8
2018010112 78
Name: number, dtype: int64
In [164]: (df.groupby(df.day * 100 + df.hour)['number'].sum()
.rename_axis('time', 0).reset_index(name='amount'))
Out[164]:
time amount
0 2018010110 19
1 2018010111 8
2 2018010112 78
Upvotes: 3