Rodwan Bakkar

Reputation: 484

Forward fill a column in a DataFrame only when a value is followed by exactly one null row

I need to forward fill a column in a pandas DataFrame, but only across gaps of exactly one null row. Runs of two or more consecutive nulls should stay null. For example:

col

v1
nan
v2
nan
v3
nan
nan
v4
nan

The output I need is:

col

v1
v1
v2
v2
v3
nan
nan
v4
v4

Upvotes: 1

Views: 410

Answers (2)

Andrej Kesely

Reputation: 195418

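# tmp1: value of the previous row (the first row is paired with itself)
# tmp2: value of the next row (the last row reuses its previous value, so a single trailing NaN still qualifies)
tmp1 = df['col'].shift(fill_value=df['col'][df.index[0]])
tmp2 = df['col'].shift(-1, fill_value=tmp1[tmp1.index[-1]])

# fill only the NaNs whose previous and next values are both non-null
m = df['col'].isna() & ~tmp1.isna() & ~tmp2.isna()
df.loc[m, 'col'] = tmp1[m]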

print(df)

Prints:

   col
0   v1
1   v1
2   v2
3   v2
4   v3
5  NaN
6  NaN
7   v4
8   v4

Upvotes: 2

Matt Pitkin

Reputation: 6392

It's not an elegant solution, but this should work:

import pandas as pd
import numpy as np

df = pd.DataFrame({"col": ["v1", np.nan, "v2", np.nan, "v3", np.nan, np.nan, "v4", np.nan]})

# get indices of NaNs
index = df["col"][df["col"].isna()].index

# get values of non-NaNs
vals = df["col"].copy()[~df["col"].isna()]

# use edited version of https://stackoverflow.com/a/48106843/1862861 to get lists of non-consecutive numbers to use as slices
def ranges(nums): 
    nums = sorted(set(nums)) 
    gaps = [[s, e] for s, e in zip(nums, nums[1:]) if s+1 < e] 
    edges = iter(nums[:1] + sum(gaps, []) + nums[-1:])
    noncons = []
    for ed in list(zip(edges, edges)):
        if ed[0] == ed[1]:
            noncons.append((ed[0], ed[1]+1))
    return noncons

slices = ranges(index)

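# loop over the isolated NaN positions and copy in the value from the row before
# (assumes the default RangeIndex, as in the example)
for start, stop in slices:
    if start - 1 in vals.index:
        df.loc[start:stop - 1, "col"] = vals.loc[start - 1]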

print(df)

   col
0   v1
1   v1
2   v2
3   v2
4   v3
5  NaN
6  NaN
7   v4
8   v4
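
For comparison, here is a more compact sketch that is not part of the answer above. It labels each run of consecutive NaNs, measures the run length, and forward fills only the runs that are exactly one row long. The helper name `ffill_single_gaps` is made up for illustration, and the sketch assumes the column is a plain pandas Series.

```python
import numpy as np
import pandas as pd


def ffill_single_gaps(s: pd.Series) -> pd.Series:
    """Forward fill NaNs only where the NaN run is exactly one row long.

    Hypothetical helper, written for illustration only.
    """
    is_na = s.isna()
    # Give every run of consecutive equal is_na values its own id.
    run_id = is_na.ne(is_na.shift()).cumsum()
    # Length of the run that each row belongs to.
    run_len = is_na.groupby(run_id).transform("size")
    # Keep the original value unless the row is a single-row NaN run;
    # in that case take the forward-filled value instead.
    single_gap = is_na & run_len.eq(1)
    return s.where(~single_gap, s.ffill())


df = pd.DataFrame(
    {"col": ["v1", np.nan, "v2", np.nan, "v3", np.nan, np.nan, "v4", np.nan]}
)
df["col"] = ffill_single_gaps(df["col"])
print(df)
```

A single NaN in the very first row has nothing before it to copy, so it stays NaN under this approach.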

Upvotes: 0
