Reputation: 899
I have a Pandas DataFrame with first and last date columns, from which I have constructed a dr
field containing the list of dates inside that range:
_id FID first_seen last_seen dr
607abc 1925 2020-11-13 2021-04-22 DatetimeIndex(['2020-11-13', '2020-11-14', '20...])
608abd 1925 2021-03-05 2021-03-07 DatetimeIndex(['2021-03-05', '2021-03-06', '20...])
442xae 1331 2020-05-05 2021-04-22 DatetimeIndex(['2020-05-05', '2020-05-06', '20...])
...
For each FID, for each date associated with that FID, I need the count of _id
values on that date. For instance, from the above:
FID date count(_id)
1925 2020-11-13 44
1925 2020-11-14 46
...
1925 2021-04-22 61
1331 2020-05-05 189
1331 2020-05-06 190
...
I know I need to group by FID first and get the min of first_seen and the max of last_seen, but then I get stuck...
Upvotes: 0
Views: 175
Reputation: 323226
Let us try explode: each exploded row is one (_id, date) pair, so size then counts _ids per (FID, date):
df.explode('dr').groupby(['FID','dr']).size()
Or we can try value_counts:
df.explode('dr').value_counts(['FID','dr'])
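A minimal runnable sketch of the explode approach, using made-up _id values and short date ranges (the real DataFrame has many more rows):

```python
import pandas as pd

# Toy data shaped like the question's DataFrame: dr holds a
# DatetimeIndex of dates between first_seen and last_seen.
df = pd.DataFrame({
    '_id': ['607abc', '608abd', '442xae'],
    'FID': [1925, 1925, 1331],
    'dr': [
        pd.date_range('2020-11-13', '2020-11-15'),
        pd.date_range('2020-11-14', '2020-11-16'),
        pd.date_range('2020-05-05', '2020-05-06'),
    ],
})

# explode turns each date in dr into its own row, so every row is
# one (_id, date) pair; size() then counts _ids per (FID, date).
out = (df.explode('dr')
         .groupby(['FID', 'dr'])
         .size()
         .reset_index(name='count(_id)'))
print(out)
```

Here the two FID 1925 ranges overlap on 2020-11-14 and 2020-11-15, so those dates get a count of 2. value_counts(['FID','dr']) gives the same counts but sorted by count descending rather than by group key.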
Upvotes: 1