Reputation: 679
I have a data frame that I plan to build a histogram from.
The data frame contains the following values.
starttime hour
1 7/01/2015 0
2 7/01/2015 0
3 7/01/2015 3
4 7/01/2015 3
5 7/01/2015 12
I want to get the following data frame.
starttime hour frequency
1 7/01/2015 0 2
2 7/01/2015 3 2
3 7/01/2015 12 1
What I have done so far
df_values = Df[['starttime','hour']]
values = df_values.groupby(['starttime'])
grouped = values.aggregate(np.sum)
Output I'm getting
hour
starttime
6/01/2015 0000000000000000000000000000000000000000000000...
6/02/2015 0000000000000000000000000000000000000000000000...
6/03/2015 0000000000000000000000000000000000000000000000...
6/04/2015 NaN
6/05/2015 435211
Any help is greatly appreciated. Thanks.
Upvotes: 0
Views: 75
Reputation: 1596
df['freq'] = 1  # helper column to count
df.groupby(['starttime', 'hour'], as_index=False).count()  # as_index is an argument of groupby, not a list element
Upvotes: 0
Reputation: 402323
Use groupby + size/count:
c = df.columns.tolist() # c = ['starttime', 'hour']
df.groupby(c).size().reset_index(name='frequency')
Or,
df.groupby(c).hour.count().reset_index(name='frequency')
starttime hour frequency
0 7/01/2015 0 2
1 7/01/2015 3 2
2 7/01/2015 12 1
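For anyone who wants to try this end to end, here is a minimal self-contained sketch. It assumes the sample rows from the question and that hour holds integers; the names cols and result are only illustrative.

```python
import pandas as pd

# Sample rows copied from the question; hour is assumed to be an integer column.
df = pd.DataFrame({
    "starttime": ["7/01/2015"] * 5,
    "hour": [0, 0, 3, 3, 12],
})

# Group on both columns and count the rows in each group.
cols = ["starttime", "hour"]
result = df.groupby(cols).size().reset_index(name="frequency")
print(result)
```

Note that size() counts the rows in each group, while count() counts only the non-null values of the selected column, so the two can differ when the data has missing values.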
Upvotes: 2