Reputation: 89
I've tried using pandas' holiday functionality against hourly data to build a boolean NumPy array in which all 24 hours of each holiday are False. I got this working with df.apply(), but that is not very efficient. Code below:
import numpy as np
import pandas as pd
from pandas.tseries.holiday import Holiday, nearest_workday
from dateutil.relativedelta import MO
from dataclasses import dataclass
dt = pd.date_range(start='1/1/2019', end='12/31/2019', freq='H')
@dataclass
class Custom_Holidays:
    # todo: rework; Holiday object has start_date and end_date
    labor_day = Holiday('Labor Day', month=9, day=1, offset=pd.DateOffset(weekday=MO(1)))
    independence_day = Holiday('Independence Day', month=7, day=4)
holidays = Custom_Holidays()
# this only filters out 1 hour instead of 24 hours
independence_day_mask = ~dt.isin(holidays.independence_day.dates(dt[0], dt[-1]))
labor_day_mask = ~dt.isin(holidays.labor_day.dates(dt[0], dt[-1]))
# these assertions fail; each mask should filter out 24 hours
assert len(dt) - np.sum(independence_day_mask*1) == 24
assert len(dt) - np.sum(labor_day_mask*1) == 24
I think it has to do with applying a mask against hourly values rather than daily values, but still, I'd think this should work.
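To make that suspicion concrete, here is a minimal sketch of what the mask is actually compared against. It assumes the same 2019 hourly range as above; `holiday_dates` and `matched` are names introduced only for this illustration, and the frequency is passed as a `pd.Timedelta` purely to sidestep the `'H'` / `'h'` alias differences between pandas versions.

```python
import pandas as pd
from pandas.tseries.holiday import Holiday

# Hourly range for 2019 (same span as in the question).
dt = pd.date_range(start='2019-01-01', end='2019-12-31', freq=pd.Timedelta(hours=1))

independence_day = Holiday('Independence Day', month=7, day=4)

# Holiday.dates() yields one timestamp per occurrence, set to midnight.
holiday_dates = independence_day.dates(dt[0], dt[-1])

# isin() does exact timestamp matching, so only the 00:00 entry is hit.
matched = dt[dt.isin(holiday_dates)]
print(holiday_dates)
print(matched)
```

Only the midnight timestamp of July 4 matches, which would explain why a single hour is masked instead of 24.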
Upvotes: 0
Views: 242
Reputation: 2222
Check this out; hope it helps. The idea is to expand each holiday date into an hourly date range covering that whole day, and then build the mask against those hourly timestamps.
import numpy as np
import pandas as pd
from dateutil.relativedelta import MO
from pandas.tseries.holiday import AbstractHolidayCalendar, Holiday
dt = pd.date_range(start='1/1/2019', end='12/31/2019', freq='H')
class Custom_Holidays(AbstractHolidayCalendar):
    # todo: rework; Holiday object has start_date and end_date
    rules = [Holiday('Labor Day', month=9, day=1, offset=pd.DateOffset(weekday=MO(1))),
             Holiday('Independence Day', month=7, day=4)]
# start from an empty index so that only actual holidays end up in the mask
holiday_df = pd.DatetimeIndex([])
holidays = Custom_Holidays().holidays(dt.min().date(), dt.max().date())
# expand each holiday into a range of 24 hourly timestamps
for day in holidays:
    holiday_df = holiday_df.append(pd.date_range(day, day + pd.DateOffset(hours=23), freq='H'))
holiday_mask = ~dt.isin(holiday_df)
print(len(dt) - np.sum(holiday_mask*1)) # this will give you 48 (24 + 24 for 2 days as holidays)
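As a possible alternative to the loop, the comparison can be done at day level instead: a sketch, assuming the same two holiday rules and the 2019 hourly range. `DatetimeIndex.normalize()` sets every timestamp to midnight, so each of the 24 hours of a holiday matches the holiday's own midnight timestamp. The class name `CustomHolidays` is made up for this example, and the frequency is given as a `pd.Timedelta` only to avoid the `'H'` / `'h'` alias differences between pandas versions.

```python
import pandas as pd
from dateutil.relativedelta import MO
from pandas.tseries.holiday import AbstractHolidayCalendar, Holiday


class CustomHolidays(AbstractHolidayCalendar):
    # Same two rules as above (the class name is illustrative).
    rules = [
        Holiday('Labor Day', month=9, day=1, offset=pd.DateOffset(weekday=MO(1))),
        Holiday('Independence Day', month=7, day=4),
    ]


dt = pd.date_range(start='2019-01-01', end='2019-12-31', freq=pd.Timedelta(hours=1))

# One midnight timestamp per holiday inside the range.
holidays = CustomHolidays().holidays(dt.min(), dt.max())

# Drop the time-of-day part before comparing, so every hour of a
# holiday is matched; the result is False on holidays, True otherwise.
holiday_mask = ~dt.normalize().isin(holidays)

print(len(dt) - holiday_mask.sum())
```

This avoids building the intermediate hourly index, and the mask stays a plain boolean NumPy array that can be used to filter `dt` or a DataFrame indexed by it.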
Upvotes: 1