Pandas:Fill null values with last available values and a flag

Question

I am looking for a logic to produce an output/update values in value column based on flag Y. Notice the second N in bold. We won’t be filling values for next two Ys since the last value is N and it’s null. If N has a value we can ffill next Y row.

I have tried using df_latest.loc[(df_latest['flag'] == 'Y'), ‘value’] =df_latest[‘value’].fillna(method='ffill') This logic doesn't cover the scenario when N is null and it forward fills all the preceding the NUll row.

flag    value   new_val

Y           1       1 

Y           2       2

Y           NaN     2

N           3       3

Y           NaN     3

Y           5       5

N           NaN     NaN

Y           NaN     NaN

Y           NaN     NaN

N           6       6

Y           NaN     6

Y           NaN     6

Y           NaN     6

Y           NaN     6

Y           NaN     6

ansev · Accepted Answer

We can use GroupBy.ffill to fill by groups, so whenever flag == N and value is null it will not be filled until value is other than null, to fill only when flag isY you can use the commented code.

blocks = (df['flag'].eq('N') & df['value'].isnull()).cumsum()
df['new_val'] = df['value'].groupby(blocks).ffill()

# if you want fill only if flag is Y
#df['new_val'] = df['value'].fillna(df['value'].groupby(blocks)
#                                              .ffill()          
#                                              .where(df['flag'].eq('Y'))
#                                       )

print(df)

Output

   flag  value  new_val
0     Y    1.0      1.0
1     Y    2.0      2.0
2     Y    NaN      2.0
3     N    3.0      3.0
4     Y    NaN      3.0
5     Y    5.0      5.0
6     N    NaN      NaN
7     Y    NaN      NaN
8     Y    NaN      NaN
9     N    6.0      6.0
10    Y    NaN      6.0
11    Y    NaN      6.0
12    Y    NaN      6.0
13    Y    NaN      6.0
14    Y    NaN      6.0

Pandas:Fill null values with last available values and a flag

Answers (1)

Related Questions