Reputation:
Suppose you have a list of dates selected by a condition, for example:
df
date random_values
2020-03-01 1
2020-03-02 2
2020-03-03 2
2020-04-01 8
2020-04-02 9
2020-04-03 10
2020-05-01 7
2020-05-02 8
2020-05-03 10
2020-05-04 20
Example condition: df['date'].where(df.random_values >= 8).dropna()
which produces the following dates:
2020-04-01
2020-04-02
2020-04-03
2020-05-02
2020-05-03
2020-05-04
This number of dates is fine for an example, but my real data has hundreds of dates. I want to condense them so the output looks similar to this:
2020-04-01: 2020-04-03
2020-05-02: 2020-05-04
I just want to reduce the number of dates by collapsing consecutive ones into date ranges. Is there an easy way to do this?
Upvotes: 0
Views: 328
Reputation: 153460
Try this also:
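import pandas as pd

df = pd.read_clipboard()  # reads the table copied from the question
df['date'] = pd.to_datetime(df['date'])
# Keep only the dates that satisfy the condition
s = df.loc[df['random_values'] >= 8, 'date']
# A new group starts wherever the gap to the previous date is not exactly one day
grp = (s.diff() != pd.Timedelta(days=1)).cumsum()
# Format each group as "first date: last date"
s.groupby(grp).agg(lambda x: f"{x.iloc[0].strftime('%Y-%m-%d')}: {x.iloc[-1].strftime('%Y-%m-%d')}")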
Output:
date
1 2020-04-01: 2020-04-03
2 2020-05-02: 2020-05-04
Name: date, dtype: object
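If you would rather not depend on the clipboard, here is a minimal self-contained sketch of the same diff/cumsum idea. It builds the sample frame from the question inline and returns the ranges as start/end columns instead of formatted strings. The helper name date_ranges and the column names start and end are my own choices for illustration, and the sort inside the helper is an added assumption so that unsorted input still groups correctly.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-03-01", "2020-03-02", "2020-03-03",
        "2020-04-01", "2020-04-02", "2020-04-03",
        "2020-05-01", "2020-05-02", "2020-05-03", "2020-05-04",
    ]),
    "random_values": [1, 2, 2, 8, 9, 10, 7, 8, 10, 20],
})


def date_ranges(dates: pd.Series) -> pd.DataFrame:
    """Collapse dates into runs of consecutive days (hypothetical helper)."""
    # Sort first so the day-to-day differences are meaningful
    s = dates.sort_values().reset_index(drop=True)
    # A new run starts wherever the gap to the previous date is not exactly one day;
    # the first row compares against NaT, which also counts as a new run
    run_id = (s.diff() != pd.Timedelta(days=1)).cumsum()
    # First and last date of each run
    out = s.groupby(run_id).agg(["min", "max"])
    return out.rename(columns={"min": "start", "max": "end"}).reset_index(drop=True)


ranges = date_ranges(df.loc[df["random_values"] >= 8, "date"])
print(ranges)
```

For the sample data this gives two rows, 2020-04-01 to 2020-04-03 and 2020-05-02 to 2020-05-04. Keeping start and end as real datetime columns makes it easy to filter or join on them later; you can still format them with strftime at the end if you only need strings.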
Upvotes: 1