Reputation: 13
Hi I have a table of data like below and I want to try do a rolling count that takes the date in the group by and the values of dates prior.
Table of data:
Date | ID |
---|---|
1/1/2020 | 123 |
2/1/2020 | 432 |
2/1/2020 | 5234 |
4/1/2020 | 543 |
5/1/2020 | 645 |
6/1/2020 | 231 |
My desired output is something like this:
Date | count |
---|---|
1/1/2020 | 1 |
2/1/2020 | 3 |
4/1/2020 | 4 |
5/1/2020 | 5 |
6/1/2020 | 6 |
I have tried the following but it doesn't seem to work on how I want it do it.
df[['id','date']].groupby('date').cumcount()
Upvotes: 1
Views: 165
Reputation: 862406
Convert column to datetimes for correct ordering if aggregate GroupBy.size
and add cumulative sum by Series.cumsum
:
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
df = df.groupby('Date').size().cumsum().reset_index(name='count')
print (df)
Date count
0 2020-01-01 1
1 2020-01-02 3
2 2020-01-04 4
3 2020-01-05 5
4 2020-01-06 6
Upvotes: 1