ian_chan
ian_chan

Reputation: 355

matching two dataframe index by month

I have two dataframe, one in weekly frequency and one in monthly frequency:

df_week:
               A   B
2012-01-02    0.1 0.2
2012-01-09    0.4 0.3
2012-01-16    0.4 0.4
2012-01-23    0.3 0.3
2012-01-30    0.2 0.4
2012-02-06    0.1 0.5

Monthly data (empty):

df_monthly
              A   
2012-01-31   NaN 
2012-02-29   NaN 
2012-03-31   NaN

What I want to do is to find out the last week in each month (for example, 2012-01-30 for January 2012), and then look at 52 weeks before the date to regress A over B, and record the entry as df_monthly.loc['2012-01-31', 'A']. What I have so far is loop over df_week:

for i in range(52, len(df_week)-1): 
    if df_week.index[i].month ~= df_week.index[i+1].month: #find out last week of the month
        temp = df_week.iloc[i-52, :]
        regress temp.A on temp.B

I now have the coefficient, sm.OLS.params[1]. But I don't know how to map it to the df_monthly data in an easy way. Theoretically, I need to find index j at df_month where:

 df_month.index[j].month = df_week.index[i].month

Upvotes: 2

Views: 246

Answers (1)

Josh
Josh

Reputation: 2835

If there is only one entry in df_month you want to use (how I understand the question) than you can create boolean mask on the index to return the correct row of df_month.

Both indexes need to be pandas datetime datatype so you may access .month

If you have your regression coefficient C and index i of df_week you can do something like

mask = (df_month.index.month == df_week.iloc[i].name.month)
df_month.loc[mask, 'A'] = C
# you can access row with just df_month[mask] also

Upvotes: 1

Related Questions