Calculate previous occurrence

Question

df  month order  customer  
0   Jan    yes       020    
1   Feb    yes       041   
2   April  no        020  
3   May    no        020

Is there a way to calculate the last month in which a customer ordered, for the rows where order = no?

Expected output:

df  month order   customer  last_order
0   Jan    yes       020    
1   Feb    yes       041   
2   April  no        020     Jan
3   May    no        020     Jan

Ch3steR · Accepted Answer

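You can use df.groupby to process each customer separately. Within each group, pd.Series.eq flags the rows where order is 'yes', pd.Series.where keeps the month only on those rows, pd.Series.ffill carries that month forward, and pd.Series.mask then blanks out the 'yes' rows themselves.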

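def func(s):
    m = s['order'].eq('yes')         # True where the customer ordered
    f = s['month'].where(m).ffill()  # carry the last ordering month forward
    return f.mask(m)                 # set the 'yes' rows themselves to NaN

df['last_order'] = df.groupby('customer', group_keys=False).apply(func)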

   month order customer last_order
0    Jan   yes      020        NaN
1    Feb   yes      041        NaN
2  April    no      020        Jan
3    May    no      020        Jan
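
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's frame and applies the same where / ffill / mask idea. Note that it is a variant rather than the code above: it forward-fills with a grouped ffill instead of groupby.apply, and the names ordered and last_order are only illustrative. It also assumes the rows are already in chronological order and that customer IDs are strings.

```python
import pandas as pd

# Data from the question; customer IDs kept as strings (assumption) to preserve leading zeros.
df = pd.DataFrame({
    "month": ["Jan", "Feb", "April", "May"],
    "order": ["yes", "yes", "no", "no"],
    "customer": ["020", "041", "020", "020"],
})

ordered = df["order"].eq("yes")        # True where the customer ordered
last_order = (
    df["month"].where(ordered)         # keep the month only on 'yes' rows
    .groupby(df["customer"]).ffill()   # carry it forward within each customer
    .mask(ordered)                     # set the 'yes' rows themselves to NaN
)
df["last_order"] = last_order
print(df)
```

Because the grouped ffill returns a Series on the original index, the result can be assigned straight back to the frame without relying on how apply stitches the groups together.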

Explanation

Here is what happens inside each group after the groupby. As an example, consider the group for a single customer (say 020) with a slightly longer order history:

  month order
0   jan   yes
1   apr    no
2   may    no
3   jun   yes
4   jul    no

# Here `df` stands for the single-customer group shown above.
m = df['order'].eq('yes')  # True where `order` is 'yes'
f = df['month'].where(m)   # keep the month only where `order` is 'yes'
f
0    jan # ---> \
1    NaN         \ #`jan` and `jun` are visible as 
2    NaN         / # they were the months with `order` 'yes'
3    jun # ---> /
4    NaN
Name: month, dtype: object
# Chaining the above with `ffill` fills the NaN values with the last valid value.

f = df['month'].where(m).ffill()
f
0    jan
1    jan # filled with the last valid value above, i.e. jan
2    jan # filled with the last valid value above, i.e. jan
3    jun
4    jun # filled with the last valid value above, i.e. jun
Name: month, dtype: object

f.mask(m) # the opposite of `pd.Series.where`: replaces values where `m` is True

0    NaN # --->\
1    jan        \ # Values set to NaN where `order` was 'yes'.
2    jan        /
3    NaN # --->/
4    jun
Name: month, dtype: object
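
The three steps above can be reproduced side by side with a small sketch like the one below. The frame is a hypothetical single-customer group matching the walk-through, and the names group, kept, filled and result are only illustrative.

```python
import pandas as pd

# Hypothetical single-customer group matching the walk-through above.
group = pd.DataFrame({
    "month": ["jan", "apr", "may", "jun", "jul"],
    "order": ["yes", "no", "no", "yes", "no"],
})

m = group["order"].eq("yes")     # True where `order` is 'yes'
kept = group["month"].where(m)   # month only on 'yes' rows, NaN elsewhere
filled = kept.ffill()            # NaN replaced by the last valid month above
result = filled.mask(m)          # 'yes' rows set back to NaN

# Put the intermediate Series next to each other for inspection.
steps = pd.DataFrame({
    "order": group["order"],
    "where": kept,
    "ffill": filled,
    "mask": result,
})
print(steps)
```

Reading across a row shows how each 'no' row ends up with the most recent 'yes' month, while each 'yes' row ends up as NaN.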
