Reputation: 197
I have the following dataframe with start and end dates, as well as a list of defined holidays.
df = pd.DataFrame({'Start': ['2018-01-01 18:47','2018-01-08 06:11','2018-01-12 10:05','2018-02-10 09:22','2018-02-18 14:14','2018-03-08 16:17','2018-03-25 17:35'],
'End': ['2018-01-02 17:00','2018-01-08 17:00','2018-01-16 10:05','2018-02-12 17:00','2018-02-19 14:14','2018-03-12 16:17','2018-03-26 17:00']})
Holidays = [date(2018,1,1),date(2018,1,15),date(2018,2,19),date(2018,3,9),date(2018,5,28),date(2018,7,4),date(2018,9,3),date(2018,11,22),date(2018,11,23),date(2018,12,24),date(2018,12,25)]
How can I go about determining the instances where a Holiday is either in between the respective Start and End Dates or is the same as the End Date?
I'm not interested if the holiday shares the same date as the Start Date.
The result should be as follows:
result = pd.DataFrame({'Start': ['2018-01-01 18:47','2018-01-08 06:11','2018-01-12 10:05','2018-02-10 09:22','2018-02-18 14:14','2018-03-08 16:17','2018-03-25 17:35'],
'End': ['2018-01-02 17:00','2018-01-08 17:00','2018-01-16 10:05','2018-02-12 17:00','2018-02-19 14:14','2018-03-12 16:17','2018-03-26 17:00'],
'InRange': [0,0,1,0,1,1,0]})
Upvotes: 0
Views: 41
Reputation: 323226
This is a O(n*m) solution
l=[any([(z>x)&(z<y) for z in Holidays]) for x , y in zip(df.Start.dt.date,df.End.dt.date)]
#[any((z>x)&(z<y) for z in Holidays) for x , y in zip(df.Start.dt.date,df.End.dt.date)]
l
Out[1089]: [False, False, True, False, False, True, False]
#df['InRange']=l
#df['InRange']=df['InRange'].astype(int)
Upvotes: 1