oceanbeach96
oceanbeach96

Reputation: 634

Forward fill NaN by id up to last value and 2 more times after last value

Task

I have the following df and want to forward fill the NaN by each id with the previous value. On the last value in each column, I want it to only forward fill 2 more times.

I currently use this:

df.set_index(['id', 'date'], inplace=True)
df = df.ffill().where(df.bfill().notnull()) 

However, it forward fills more than the desired 2 times. Any help would be super!

df

   id      date     compname   fyr
1   1   2016-02-29     a        5.0
2   1   2016-03-31    NaN       NaN
3   1   2016-04-30    NaN       NaN
4   1   2016-05-31     a        5.0
5   1   2016-06-30    NaN       NaN
6   1   2016-07-31    NaN       NaN
7   1   2016-08-31     a        5.0
8   1   2016-09-30    NaN       NaN
9   1   2016-10-31    NaN       NaN
10  1   2016-11-30     a        5.0
11  1   2016-12-31    NaN       NaN
12  1   2017-01-31    NaN       NaN
13  1   2017-02-28    NaN       NaN
14  1   2017-03-31    NaN       NaN
15  2   2016-02-29     b        5.0
16  2   2016-03-31    NaN       NaN
17  2   2016-04-30    NaN       NaN
18  2   2016-05-31     b        5.0
19  2   2016-06-30    NaN       NaN
20  2   2016-07-31    NaN       NaN
21  2   2016-08-31     b        5.0
22  2   2016-09-30    NaN       NaN
23  2   2016-10-31    NaN       NaN
24  2   2016-11-30     b        5.0
25  2   2016-12-31    NaN       NaN
26  2   2017-01-31    NaN       NaN
27  2   2017-02-28    NaN       NaN
28  2   2017-03-31    NaN       NaN

desired df

   id      date     compname   fyr
1   1   2016-02-29     a       5.0
2   1   2016-03-31     a       5.0
3   1   2016-04-30     a       5.0
4   1   2016-05-31     a       5.0
5   1   2016-06-30     a       5.0
6   1   2016-07-31     a       5.0
7   1   2016-08-31     a       5.0
8   1   2016-09-30     a       5.0
9   1   2016-10-31     a       5.0
10  1   2016-11-30     a       5.0
11  1   2016-12-31     a       5.0
12  1   2017-01-31     a       5.0
13  1   2017-02-28    NaN      NaN
14  1   2017-03-31    NaN      NaN
15  2   2016-02-29     b       5.0
16  2   2016-03-31     b       5.0
17  2   2016-04-30     b       5.0
18  2   2016-05-31     b       5.0
19  2   2016-06-30     b       5.0
20  2   2016-07-31     b       5.0
21  2   2016-08-31     b       5.0
22  2   2016-09-30     b       5.0
23  2   2016-10-31     b       5.0
24  2   2016-11-30     b       5.0
25  2   2016-12-31     b       5.0
26  2   2017-01-31     b       5.0
27  2   2017-02-28    NaN      NaN
28  2   2017-03-31    NaN      NaN

Upvotes: 1

Views: 38

Answers (2)

jezrael
jezrael

Reputation: 862511

If need back and forward filling per groups only 2 values add parameter limit to bfill and ffill methods:

f = lambda x: x.bfill(limit=2).ffill(limit=2)
df[["compname","fyr"]] = df.groupby("id")[["compname","fyr"]].apply(f)
print (df)
    id        date compname  fyr
1    1  2016-02-29        a  5.0
2    1  2016-03-31        a  5.0
3    1  2016-04-30        a  5.0
4    1  2016-05-31        a  5.0
5    1  2016-06-30        a  5.0
6    1  2016-07-31        a  5.0
7    1  2016-08-31        a  5.0
8    1  2016-09-30        a  5.0
9    1  2016-10-31        a  5.0
10   1  2016-11-30        a  5.0
11   1  2016-12-31        a  5.0
12   1  2017-01-31        a  5.0
13   1  2017-02-28      NaN  NaN
14   1  2017-03-31      NaN  NaN
15   2  2016-02-29        b  5.0
16   2  2016-03-31        b  5.0
17   2  2016-04-30        b  5.0
18   2  2016-05-31        b  5.0
19   2  2016-06-30        b  5.0
20   2  2016-07-31        b  5.0
21   2  2016-08-31        b  5.0
22   2  2016-09-30        b  5.0
23   2  2016-10-31        b  5.0
24   2  2016-11-30        b  5.0
25   2  2016-12-31        b  5.0
26   2  2017-01-31        b  5.0
27   2  2017-02-28      NaN  NaN
28   2  2017-03-31      NaN  NaN

Upvotes: 1

Henry Yik
Henry Yik

Reputation: 22493

One way is to first do a groupby and bfill, then another groupby with fillna:

df[["compname","fyr"]] = df.groupby("id")["compname","fyr"].bfill()

print (df.groupby("id").apply(lambda d: d.fillna(d.shift(2))))

    id        date compname  fyr
1    1  2016-02-29        a  5.0
2    1  2016-03-31        a  5.0
3    1  2016-04-30        a  5.0
4    1  2016-05-31        a  5.0
5    1  2016-06-30        a  5.0
6    1  2016-07-31        a  5.0
7    1  2016-08-31        a  5.0
8    1  2016-09-30        a  5.0
9    1  2016-10-31        a  5.0
10   1  2016-11-30        a  5.0
11   1  2016-12-31        a  5.0
12   1  2017-01-31        a  5.0
13   1  2017-02-28      NaN  NaN
14   1  2017-03-31      NaN  NaN
15   2  2016-02-29        b  5.0
16   2  2016-03-31        b  5.0
17   2  2016-04-30        b  5.0
18   2  2016-05-31        b  5.0
19   2  2016-06-30        b  5.0
20   2  2016-07-31        b  5.0
21   2  2016-08-31        b  5.0
22   2  2016-09-30        b  5.0
23   2  2016-10-31        b  5.0
24   2  2016-11-30        b  5.0
25   2  2016-12-31        b  5.0
26   2  2017-01-31        b  5.0
27   2  2017-02-28      NaN  NaN
28   2  2017-03-31      NaN  NaN

Upvotes: 1

Related Questions