jarge
jarge

Reputation: 185

How to groupby datetime data by ID and time period?

So I have this kind of data. Basically there are bunch of people do something and we keep track of them. So I am interested in one of the activities, i.e running. ID - person's id, date - date, DONE - Yes if they run that day. I want to find out in certain periods how many times people did that activity.

ID             DATE              DONE  
1           2009-1-3             Yes
1           2009-1-15            Yes
2           2009-9-3             Yes
3           2009-10-7            No
1           2009-12-1            Yes
3           2009-10-8            Yes
2           2009-10-21           Yes

I want something like this, not necessarily exactly like this.

ID          PERIOD              COUNT
1           2009-1              2
            2009-12             1
2           2009-9              1
            2009-10             1
3           2009-10             1

Thanks in advance!

Upvotes: 1

Views: 197

Answers (1)

jezrael
jezrael

Reputation: 863156

If need count also No values like 0 aggregate sum with boolean mask, for month periods use Series.dt.to_period:

df1 = ((df['DONE'] == 'Yes')
        .groupby([df['ID'], pd.to_datetime(df['DATE']).dt.to_period('m')])
        .sum()
        .astype(int)
        .reset_index(name='COUNT'))
print (df1)
   ID     DATE  COUNT
0   1  2009-01      2
1   1  2009-12      1
2   2  2009-09      1
3   2  2009-10      1
4   3  2009-10      1

Or you can first filter only rows with Yes and aggregate counts by GroupBy.size:

df1 = (df[df['DONE'] == 'Yes']
        .groupby(['ID', pd.to_datetime(df['DATE']).dt.to_period('m')])
        .size()
        .reset_index(name='COUNT'))
print (df1)
   ID     DATE  COUNT
0   1  2009-01      2
1   1  2009-12      1
2   2  2009-09      1
3   2  2009-10      1
4   3  2009-10      1

Upvotes: 1

Related Questions