Reputation: 23
I am looking for a logic to produce an output/update values in value column based on flag Y. Notice the second N in bold. We won’t be filling values for next two Ys since the last value is N and it’s null. If N has a value we can ffill next Y row.
I have tried using df_latest.loc[(df_latest['flag'] == 'Y'), ‘value’] =df_latest[‘value’].fillna(method='ffill') This logic doesn't cover the scenario when N is null and it forward fills all the preceding the NUll row.
flag value new_val
Y 1 1
Y 2 2
Y NaN 2
N 3 3
Y NaN 3
Y 5 5
N NaN NaN
Y NaN NaN
Y NaN NaN
N 6 6
Y NaN 6
Y NaN 6
Y NaN 6
Y NaN 6
Y NaN 6
Upvotes: 1
Views: 774
Reputation: 30930
We can use GroupBy.ffill
to fill by groups,
so whenever flag == N
and value
is null it will not be filled until value is other than null,
to fill only when flag isY
you can use the commented code.
blocks = (df['flag'].eq('N') & df['value'].isnull()).cumsum()
df['new_val'] = df['value'].groupby(blocks).ffill()
# if you want fill only if flag is Y
#df['new_val'] = df['value'].fillna(df['value'].groupby(blocks)
# .ffill()
# .where(df['flag'].eq('Y'))
# )
print(df)
Output
flag value new_val
0 Y 1.0 1.0
1 Y 2.0 2.0
2 Y NaN 2.0
3 N 3.0 3.0
4 Y NaN 3.0
5 Y 5.0 5.0
6 N NaN NaN
7 Y NaN NaN
8 Y NaN NaN
9 N 6.0 6.0
10 Y NaN 6.0
11 Y NaN 6.0
12 Y NaN 6.0
13 Y NaN 6.0
14 Y NaN 6.0
Upvotes: 1