Reputation: 1010
I have daily data and a loop that finds the third Friday of a month and then sets a column to 2 for the 20 days before and after that third Friday. However, the marking only works for the days AFTER the third Friday, not before, and I do not understand why. My dataframe "merged" is the following:
Date ID Window
01/01/2000 1 0
01/01/2000 1 0
02/01/2000 2 0
02/01/2000 2 0
The code so far is the following:
import calendar
import pandas as pd

#Get the third Friday of a month:
c = calendar.Calendar(firstweekday=calendar.SUNDAY)
year = 2000; month = 3
monthcal = c.monthdatescalendar(year,month)
third_friday = [day for week in monthcal for day in week if \
                day.weekday() == calendar.FRIDAY and \
                day.month == month][2]
#Loop through dates to change the window column:
for beg in pd.date_range("2000-01-01", "2017-05-01"):
    beg= third_friday
    merged["window"].loc[beg: beg + pd.to_timedelta(20,"D")] = 2
    merged["window"].loc[beg: beg - pd.to_timedelta(20,"D")] = 2
    #repeat the same for the next Fridays:
    if month==12:
        year=year+1
        month=0
    if year>=2017 and month>=3:
        break
    month = month +3
    monthcal = c.monthdatescalendar(year,month)
    third_friday = [day for week in monthcal for day in week if \
                    day.weekday() == calendar.FRIDAY and \
                    day.month == month][2]
When I run this code, the window column is not set to 2 BEFORE the third Friday. Only the 20 days after the third Friday are changed to 2. Does anybody know what I am doing wrong?
Upvotes: 0
Views: 105
Reputation: 7058
The easiest approach would be to define a function that calculates the third Friday of the month, given a year and month. Either use your method with calendar, or something like this might work too:
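def third_friday_of(year, month):
    # The third Friday always falls between the 15th and the 21st of the month.
    # pd.date_range replaces pd.DatetimeIndex(start=..., end=...), which newer pandas no longer accepts.
    daterange = pd.date_range(start='%i-%02i-15' % (year, month), end='%i-%02i-21' % (year, month), freq='D')
    return daterange[daterange.weekday == 4][0]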
This returns a pandas.Timestamp, but that's a subclass of datetime.datetime, so it should pose no further problems in your program.
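As a cross-check on the 15th-to-21st range used above, the third Friday can also be derived arithmetically with only the standard library. This is a minimal sketch, not part of the original answer: the helper name third_friday_stdlib is hypothetical, and it returns a datetime.date rather than a pandas.Timestamp.

```python
import datetime

def third_friday_stdlib(year, month):
    """Hypothetical helper: third Friday of a month, standard library only."""
    first = datetime.date(year, month, 1)
    # weekday(): Monday == 0 ... Friday == 4
    days_to_first_friday = (4 - first.weekday()) % 7
    # The third Friday is two weeks after the first one.
    return first + datetime.timedelta(days=days_to_first_friday + 14)

print(third_friday_stdlib(2000, 3))  # 2000-03-17
```

Because the first Friday lands on day 1 to 7, the result always lands on day 15 to 21.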
I also defined a separate function to do the actual changing of the DataFrame, with the interval and window as parameters:
def process_dataframe(df, begin_year, begin_month, end_year, end_month, interval_months=3, window=20):
    # Month-start ('MS') frequency yields one date per interval and includes the end month itself.
    dates = pd.date_range(start='%i-%02i-01' % (begin_year, begin_month), end='%i-%02i-01' % (end_year, end_month), freq='%iMS' % interval_months)
    for d in dates:
        third_friday = third_friday_of(d.year, d.month)
        # print(d, third_friday)
        # Mark `window` days on either side of the third Friday; df needs a sorted DatetimeIndex.
        df.loc[third_friday - pd.Timedelta(days=window) : third_friday + pd.Timedelta(days=window), 'Window'] = 2
The reason it did not work for you is that merged["window"].loc[beg: beg - pd.to_timedelta(20,"D")] = 2 should have been merged["window"].loc[beg - pd.to_timedelta(20,"D"):beg] = 2. A label slice on a sorted date index runs from the earlier date to the later one, so a slice whose start lies after its end selects nothing.
merged["window"].loc[beg: beg + pd.to_timedelta(20,"D")] = 2 in itself has a second problem. With merged["window"] you ask for a Series, but it's not 100% clear or deterministic whether you get a view or a copy. It is better to do this in a single .loc call, like in my code.
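Both points can be checked on a small example. This is only a sketch under assumptions: the frame below is a hypothetical stand-in for merged, with a sorted daily DatetimeIndex and a "Window" column, and 2000-03-17 is the third Friday of March 2000.

```python
import pandas as pd

# Hypothetical stand-in for `merged`: one row per day, indexed by date.
idx = pd.date_range("2000-02-01", "2000-04-30", freq="D")
merged = pd.DataFrame({"Window": 0}, index=idx)

third_friday = pd.Timestamp("2000-03-17")
window = pd.Timedelta(days=20)

# Start after end: the slice is empty, so assigning through it changes nothing.
reversed_slice = merged.loc[third_friday : third_friday - window]
print(len(reversed_slice))  # 0

# Earlier bound first, and the column named inside the same .loc call.
merged.loc[third_friday - window : third_friday + window, "Window"] = 2
print(int((merged["Window"] == 2).sum()))  # 41
```

The 41 marked rows are the third Friday itself plus 20 days on either side, since label slices include both endpoints.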
Upvotes: 1