Vincent Teyssier
Vincent Teyssier

Reputation: 2217

groupby resulting dataframe concatenation

I have a dataset looking like that:

number   day        hour
12       20180101   10
7        20180101   10
8        20180101   11
78       20180101   12

And I would like to get a new dataframe grouping by day,hour, having a sum of number so the result look like :

time         amount
2018010110   19
2018010111    8
2018010112   78

To do that I am writing the following:

new_df['amount'] = df.groupby(['day','hour'])['number'].agg('sum')

The result i get is But then day and hour seems to be sub keys of number. How can I transform new_df so it only have the desired columns ?

If I try to access new_df['day'] or new_df['amount']['day'] I get error that the key does not exist.

If i could access it I would run something like

new_df['time'] = new_df.... .map(str) + new_df... .map(str)

Upvotes: 1

Views: 54

Answers (2)

shivsn
shivsn

Reputation: 7828

IIUC:

In [24]: df['day_hour'] = df['day'].astype(str)+df['hour'].astype(str)

In [27]: wd.groupby(['day_hour'],as_index=False)['number'].sum()
Out[27]: 
     day_hour  number
0  2018010110      19
1  2018010111       8
2  2018010112      78

Upvotes: 4

Zero
Zero

Reputation: 76917

You could use

In [160]: df.groupby(df.day * 100 + df.hour)['number'].sum()
Out[160]:
2018010110    19
2018010111     8
2018010112    78
Name: number, dtype: int64

In [164]: (df.groupby(df.day * 100 + df.hour)['number'].sum()
             .rename_axis('time', 0).reset_index(name='amount'))
Out[164]:
         time  amount
0  2018010110      19
1  2018010111       8
2  2018010112      78

Upvotes: 3

Related Questions