Pandas - Conditional Column Creation based on the Last working day of the week

Question

I have a data-frame like this,

date       day_of_wk    wk_num  wkend_hol_ind   freq_daily
1/1/2007    Monday        1         H   
1/2/2007    Tuesday       1         H   
1/3/2007    Wednesday     1                       D
1/4/2007    Thursday      1                       D
1/5/2007    Friday        1                       D
1/6/2007    Saturday      1         W   
1/7/2007    Sunday        1         W   
1/8/2007    Monday        2                       D
1/9/2007    Tuesday       2                       D
1/10/2007   Wednesday     2                       D
1/11/2007   Thursday      2                       D
1/12/2007   Friday        2         H     
1/13/2007   Saturday      2         W
1/14/2007   Sunday        2         W

I am trying to create a new column inside the data frame as "freq_weekly".

This column will have a value of "W" for the last value of "D" for each value of "wk_num" and blank for other columns.

for example, 1/5/2017 will have a value "W" and 1/11/2017 will have a value "W". Others will have blanks in it.

Kindly help me with this.

jezrael · Accepted Answer

Idea is create helper mask for consecutive groups by ne, shift and cumsum, then filter only D rows and get last values by Series.duplicated with keep='last', then excluded rows by Series.reindex:

m = df['freq_daily'].eq('D')
mask = m.ne(m.shift()).cumsum()[m].duplicated(keep='last').reindex(df.index, fill_value=True)

If need test more columns for duplicates:

mask = (df.assign(m=m.ne(m.shift()).cumsum())[m]
          .duplicated(['wk_num','freq_daily'],keep='last')
          .reindex(df.index, fill_value=True))

So possible create new column by numpy.where:

df['freq_weekly'] = np.where(mask, '', 'W')
print (df)
         date  day_of_wk  wk_num wkend_hol_ind freq_daily freq_weekly
0    1/1/2007     Monday       1             H        NaN            
1    1/2/2007    Tuesday       1             H        NaN            
2    1/3/2007  Wednesday       1           NaN          D            
3    1/4/2007   Thursday       1           NaN          D            
4    1/5/2007     Friday       1           NaN          D           W
5    1/6/2007   Saturday       1             W        NaN            
6    1/7/2007     Sunday       1             W        NaN            
7    1/8/2007     Monday       2           NaN          D            
8    1/9/2007    Tuesday       2           NaN          D            
9   1/10/2007  Wednesday       2           NaN          D            
10  1/11/2007   Thursday       2           NaN          D           W
11  1/12/2007     Friday       2             H        NaN            
12  1/13/2007   Saturday       2             W        NaN            
13  1/14/2007     Sunday       2             W        NaN

Pandas - Conditional Column Creation based on the Last working day of the week

Answers (1)

Related Questions