Reputation: 95
I am trying to get the frequency of days grouped by its ID. So I would like to know:
So it should be look like this:
On top of that I tried the following:
number = data.groupby(by=['ID, Date_Time']).size().unstack(fill_value=0)
but this on gives me a different data frame that I can't add it to the original on "data".
Upvotes: 0
Views: 998
Reputation: 2780
I have created a simple data sample as you have only provided a screenshot, you need to convert the date to datetime format and then group according to id and pd.Grouper
according to distinct days (D). For total all you need is groupby
and transform the count of each group.
data = [[1, "2010-01-01 12:21:46"],[1, "2010-01-01 12:22:46"],[1, "2010-04-02 12:21:46"],[1, "2010-06-06 12:21:46"],[2, "2010-06-06 12:21:46"],[2, "2010-06-06 19:21:46"],[2, "2010-09-06 12:21:46"]]
import pandas as pd
df = pd.DataFrame(data,columns=("id","date_"))
df['date_']= pd.to_datetime(df['date_'],format='%Y-%m-%d %H:%M:%S')
df['freq'] = df.groupby(['id', pd.Grouper(key='date_', freq='D')])['id'].transform('count')
df['total'] = df.groupby('id')['id'].transform('count')
Final ouptput will look like this:
id date_ freq total 0 1 2010-01-01 12:21:46 2 4 1 1 2010-01-01 12:22:46 2 4 2 1 2010-04-02 12:21:46 1 4 3 1 2010-06-06 12:21:46 1 4 4 2 2010-06-06 12:21:46 2 3 5 2 2010-06-06 19:21:46 2 3 6 2 2010-09-06 12:21:46 1 3
Upvotes: 3