Reputation: 145
I need to get the minimum and maximum dates of the two periods where column y is greater than 0.
Dataframe:
Date Y
25/02/2019 0
24/02/2019 1
23/02/2019 1
22/02/2019 0
21/02/2019 1
20/02/2019 1
19/02/2019 1
18/02/2019 0
17/02/2019 0
16/02/2019 0
Perhaps return the results in a list. Using the table above I expect to get the following:
[[23/02/2019, 24/02/2019],[21/02/2019, 19/02/2019]]
How the results are returned is not important. It's getting them. I understand how to get min and max dates if there was only one period where Y > 0 :
min(ua_dates['Date'])
max(ua_dates['Date'])
Trouble is this gives me 19/02/2019 and 24/02/2019 using the table above. Perhaps there is some sort of library out there?
Upvotes: 1
Views: 908
Reputation: 323226
IIUC you need select all Y==1 out , then if they are not continue , you want them into different group, here we using cumsum
ua_dates.loc[df.Y==1,'Date'].groupby(ua_dates.Y.eq(0).cumsum()).agg(['min','max']).values.tolist()
Out[713]: [['23/02/2019', '24/02/2019'], ['19/02/2019', '21/02/2019']]
Upvotes: 2
Reputation: 1208
You just need to mask your data:
mask = ua_dates['Y'] > 0
min(ua_dates[mask]['Date'])
max(ua_dates[mask]['Date'])
Upvotes: 1