Pandas: how to dynamically fill NaN?

Question

I have a dataset with lots of NaN values, and I would like to fill them based on another column's value. Here is an example.

  Ind Init Desc
   1   A   Apple
   2   A   Apple
   3   A   NaN
   4   B   NaN
   5   B   Banana
   6   B   Banana
   7   C   Cherry
   8   C   NaN
   9   C   Cherry
   10  D   NaN
   11  D   NaN
   12  D   NaN
   13  A   NaN
   14  A   NaN
   15  A   Apple

I cannot simply use df.fillna('apple') because the fill has to be dynamic. I also cannot use either method='ffill' or method='bfill' on its own, because A needs a forward fill and B needs a backward fill. In the case of D, the value should be 'No fruit description available!'

You may assume that Init is never missing and that each unique Init has only one fruit description.

What would be the best way to handle this case?

anky · Accepted Answer

You can use something like:

# transform keeps the original index, so the result aligns with df row by row
df['Desc1'] = (df.groupby('Init')['Desc']
                 .transform(lambda x: x.ffill().bfill())
                 .fillna('No fruit description available!'))
print(df)

    Ind Init    Desc                            Desc1
0     1    A   Apple                            Apple
1     2    A   Apple                            Apple
2     3    A     NaN                            Apple
3     4    B     NaN                           Banana
4     5    B  Banana                           Banana
5     6    B  Banana                           Banana
6     7    C  Cherry                           Cherry
7     8    C     NaN                           Cherry
8     9    C  Cherry                           Cherry
9    10    D     NaN  No fruit description available!
10   11    D     NaN  No fruit description available!
11   12    D     NaN  No fruit description available!
12   13    A     NaN                            Apple
13   14    A     NaN                            Apple
14   15    A   Apple                            Apple
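
Since the question guarantees one description per Init, another option is to build an Init-to-Desc lookup and map it back onto the column. This is only a sketch of that alternative, not the approach used in the answer above; the name `lookup` is made up for illustration, and the frame is rebuilt from the question's sample so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Ind": range(1, 16),
    "Init": list("AAABBBCCCDDDAAA"),
    "Desc": ["Apple", "Apple", np.nan, np.nan, "Banana", "Banana",
             "Cherry", np.nan, "Cherry", np.nan, np.nan, np.nan,
             np.nan, np.nan, "Apple"],
})

# Hypothetical helper: an Init -> Desc lookup built from the rows
# that do have a description (one entry per Init).
lookup = (
    df.dropna(subset=["Desc"])
      .drop_duplicates("Init")
      .set_index("Init")["Desc"]
)

# Map each row's Init through the lookup. An Init with no description
# anywhere (D here) is absent from the lookup, comes back as NaN,
# and then receives the placeholder text.
df["Desc1"] = df["Init"].map(lookup).fillna("No fruit description available!")
print(df)
```

This relies on the one-description-per-Init assumption. If an Init could carry several different descriptions, drop_duplicates would silently keep the first one, so the ffill/bfill version is the safer choice in that case.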
