pandas fillna based on previous row value

I have a column like this,

valueCount
0.0
nan
2.0
1.0
1.0
1.0
nan
nan
nan
4.0

I want to fillna based on next value(add) or previous value(subtract) available. so the result should be,

valueCount
0.0
**1.0**
2.0
1.0
1.0
1.0
**1.0**
**2.0**
**3.0**
4.0

i know it is very condititonal based, if my previous value is 0 i can add +1 to the nan row, else i should start adding from 0,1,2 and so on.

i can do this algorithm in simple list of python, but in pandas is there any simple way ?

Upvotes: 1

Views: 2963

Answers (2)

JHBonarius
JHBonarius

Reputation: 11261

In this case you could also use interpolation and rounding down. But this will not work for all data sets.

Consider:

import pandas as pd
import numpy as np
s = pd.Series([0, np.nan, 2, 1, 1, 1, np.nan, np.nan, np.nan, 4])

Then np.floor(s.interpolate()) gives

0    0.0
1    1.0
2    2.0
3    1.0
4    1.0
5    1.0
6    1.0
7    2.0
8    3.0
9    4.0
dtype: float64

Upvotes: 1

jezrael
jezrael

Reputation: 862541

You can use:

a = df['valueCount'].isnull()
b = a.cumsum()
c = df['valueCount'].bfill()
d = c + (b-b.mask(a).bfill().fillna(0).astype(int)).sub(1)
df['valueCount'] =  df['valueCount'].fillna(d)
print (df)

   valueCount
0         0.0
1         1.0
2         2.0
3         1.0
4         1.0
5         1.0
6         1.0
7         2.0
8         3.0
9         4.0

Detail + explanation:

#back filling NaN values
x = df['valueCount'].bfill()
#compare by NaNs
a = df['valueCount'].isnull()
#cumulative sum of mask
b = a.cumsum()
#replace Trues to NaNs
c = b.mask(a)
#forward fill NaNs
d = b.mask(a).bfill()
#First NaNs to 0 and cast to integers
e = b.mask(a).bfill().fillna(0).astype(int)
#add to backfilled Series cumulative sum and subtract from cumulative sum Series, 1
f = x + b - e - 1
#replace NaNs by Series f
g = df['valueCount'].fillna(f)
df = pd.concat([df['valueCount'], x, a, b, c, d, e, f, g], axis=1, 
               keys=('orig','x','a','b','c','d','e', 'f', 'g'))
print (df)
   orig    x      a  b    c    d  e    f    g
0   0.0  0.0  False  0  0.0  0.0  0 -1.0  0.0
1   NaN  2.0   True  1  NaN  1.0  1  1.0  1.0
2   2.0  2.0  False  1  1.0  1.0  1  1.0  2.0
3   1.0  1.0  False  1  1.0  1.0  1  0.0  1.0
4   1.0  1.0  False  1  1.0  1.0  1  0.0  1.0
5   1.0  1.0  False  1  1.0  1.0  1  0.0  1.0
6   NaN  4.0   True  2  NaN  4.0  4  1.0  1.0
7   NaN  4.0   True  3  NaN  4.0  4  2.0  2.0
8   NaN  4.0   True  4  NaN  4.0  4  3.0  3.0
9   4.0  4.0  False  4  4.0  4.0  4  3.0  4.0

Upvotes: 6

Related Questions