Reputation: 484
I need to forward fill a column in a pandas DataFrame, but only where a null value is isolated, i.e. the gap is a single row. Runs of two or more consecutive nulls should be left as they are. For example:
col
v1
nan
v2
nan
v3
nan
nan
v4
nan
The output I need is:
col
v1
v1
v2
v2
v3
nan
nan
v4
v4
Upvotes: 1
Views: 410
Reputation: 195418
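# previous row's value; the first row is padded with its own value
tmp1 = df['col'].shift(fill_value=df['col'][df.index[0]])
# next row's value; the last row is padded with its previous value, so a trailing single NaN still qualifies
tmp2 = df['col'].shift(-1, fill_value=tmp1[tmp1.index[-1]])
# NaN rows whose previous and next values are both present
m = df['col'].isna() & ~tmp1.isna() & ~tmp2.isna()
df.loc[m, 'col'] = tmp1[m]
print(df)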
Prints:
col
0 v1
1 v1
2 v2
3 v2
4 v3
5 NaN
6 NaN
7 v4
8 v4
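For comparison, the same "fill only isolated NaNs" rule can be expressed by measuring the length of each NaN run instead of looking at the two neighbours. This is only a sketch: the helper name ffill_isolated is made up for illustration, and it assumes that a single NaN in the very first row should stay NaN because there is nothing above it to copy.

```python
import numpy as np
import pandas as pd


def ffill_isolated(s: pd.Series) -> pd.Series:
    """Forward fill only NaN runs of length one (hypothetical helper)."""
    isna = s.isna()
    # A new run starts wherever the NaN flag differs from the previous row.
    run_id = isna.ne(isna.shift()).cumsum()
    # Length of the run each row belongs to.
    run_len = isna.groupby(run_id).transform("size")
    # Rows that are NaN and form a run of exactly one row.
    isolated = isna & run_len.eq(1)
    # Take the forward-filled value only at the isolated positions.
    return s.where(~isolated, s.ffill())


s = pd.Series(["v1", np.nan, "v2", np.nan, "v3", np.nan, np.nan, "v4", np.nan])
result = ffill_isolated(s)
print(result)
```

Note that a plain s.ffill(limit=1) is not equivalent: it would also fill the first row of a longer NaN run, which is exactly what the question wants to avoid.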
Upvotes: 2
Reputation: 6392
It's not an elegant solution, but this should work:
import pandas as pd
import numpy as np
df = pd.DataFrame({"col": ["v1", np.nan, "v2", np.nan, "v3", np.nan, np.nan, "v4", np.nan]})
# get positions of NaNs (assumes the default RangeIndex, so labels equal positions)
index = df.index[df["col"].isna()]
# use edited version of https://stackoverflow.com/a/48106843/1862861 to get
# the isolated (non-consecutive) NaN positions as (start, stop) slices
def ranges(nums):
    nums = sorted(set(nums))
    if not nums:
        return []
    gaps = [[s, e] for s, e in zip(nums, nums[1:]) if s + 1 < e]
    edges = iter(nums[:1] + sum(gaps, []) + nums[-1:])
    noncons = []
    for start, end in zip(edges, edges):
        # keep only runs that consist of a single position
        if start == end:
            noncons.append((start, end + 1))
    return noncons
slices = ranges(index)
# loop over the isolated NaN positions and copy the value from the row above
col_pos = df.columns.get_loc("col")
for start, stop in slices:
    # a NaN in the very first row has nothing above it, so leave it alone
    if start > 0:
        df.iloc[start:stop, col_pos] = df["col"].iloc[start - 1]
print(df)
col
0 v1
1 v1
2 v2
3 v2
4 v3
5 NaN
6 NaN
7 v4
8 v4
Upvotes: 0