Shijith

Reputation: 4872

Fill NaN in multiple columns with values from another column within a pandas DataFrame

Pandas version 0.23.4, Python version 3.7.1.
I have a dataframe df as below:

import numpy as np
import pandas as pd

df = pd.DataFrame([[0.1, 2, 55, 0, np.nan],
                   [0.2, 4, np.nan, 1, 99],
                   [0.3, np.nan, 22, 5, 88],
                   [0.4, np.nan, np.nan, 4, 77]],
                  columns=list('ABCDE'))
     A    B     C  D     E
0  0.1  2.0  55.0  0   NaN
1  0.2  4.0   NaN  1  99.0
2  0.3  NaN  22.0  5  88.0
3  0.4  NaN   NaN  4  77.0

I want to replace the NaN values in columns `B` and `C` with the value from column `A` in the same row.

Expected output is

     A    B     C  D     E
0  0.1  2.0  55.0  0   NaN
1  0.2  4.0   0.2  1  99.0
2  0.3  0.3  22.0  5  88.0
3  0.4  0.4   0.4  4  77.0

I have tried fillna with ffill along axis 0, but it does not give the expected output (it fills from the row above in the same column):

df.fillna(method='ffill',axis=0, inplace = True)
    A    B     C   D     E
0  0.1  2.0  55.0  0   NaN
1  0.2  4.0  55.0  1  99.0
2  0.3  4.0  22.0  5  88.0
3  0.4  4.0  22.0  4  77.0  

df.fillna(method='ffill',axis=1, inplace = True)

output: NotImplementedError:

Also tried

df[['B','C']] = df[['B','C']].fillna(df.A)
output:
    A    B     C   D     E
0  0.1  2.0  55.0  0   NaN
1  0.2  4.0   NaN  1  99.0
2  0.3  NaN  22.0  5  88.0
3  0.4  NaN   NaN  4  77.0

I also tried to fill all NaNs in B and C with 0 using inplace, but this does not give the expected output either:

df[['B','C']].fillna(0,inplace=True)
output:
     A    B     C  D     E
0  0.1  2.0  55.0  0   NaN
1  0.2  4.0   NaN  1  99.0
2  0.3  NaN  22.0  5  88.0
3  0.4  NaN   NaN  4  77.0

Filling a slice of the data frame with 0 does work if the result is assigned back to the same subset:

df[['B','C']] = df[['B','C']].fillna(0)
output:
     A    B     C  D     E
0  0.1  2.0  55.0  0   NaN
1  0.2  4.0   0.0  1  99.0
2  0.3  0.0  22.0  5  88.0
3  0.4  0.0   0.0  4  77.0

1) How do I fill NaN values in columns B and C using values from column A of the given data frame?
2) Why is inplace not working when using fillna on a subset of the data frame?
3) How do I ffill along the rows (is it implemented)?

Upvotes: 4

Views: 3461

Answers (1)

jezrael

Reputation: 862601

1) How do I fill NaN values in columns B and C using values from column A of the given data frame?

Because replacing by column is not implemented (fillna matches the index of the passed Series against the column labels, not the rows), a possible solution is a double transpose:

df[['B','C']] = df[['B','C']].T.fillna(df['A']).T
print (df)
     A    B     C  D     E
0  0.1  2.0  55.0  0   NaN
1  0.2  4.0   0.2  1  99.0
2  0.3  0.3  22.0  5  88.0
3  0.4  0.4   0.4  4  77.0

Or:

m = df[['B','C']].isna()
df[['B','C']] = df[['B','C']].mask(m, m.astype(int).mul(df['A'], axis=0))
print (df)
     A    B     C  D     E
0  0.1  2.0  55.0  0   NaN
1  0.2  4.0   0.2  1  99.0
2  0.3  0.3  22.0  5  88.0
3  0.4  0.4   0.4  4  77.0
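
As a further illustration of the same idea, here is a minimal sketch that fills each column separately, assuming the question's `df`. A Series-level `fillna` with another Series aligns on the row index, so no transpose is needed. The names `cols` and `filled` are only illustrative and are not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[0.1, 2, 55, 0, np.nan],
                   [0.2, 4, np.nan, 1, 99],
                   [0.3, np.nan, 22, 5, 88],
                   [0.4, np.nan, np.nan, 4, 77]],
                  columns=list('ABCDE'))

cols = ['B', 'C']      # columns to fill (illustrative name)
filled = df.copy()     # keep the original frame untouched

# apply passes each column as a Series; Series.fillna(Series) aligns on
# the row index, so each NaN takes the value of A from the same row.
filled[cols] = filled[cols].apply(lambda s: s.fillna(df['A']))
print(filled)
```

Column E is not in `cols`, so its NaN in row 0 is left as it is.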

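2) Why is inplace not working when using fillna on a subset of the data frame?

I think the reason is chained assignment: `df[['B','C']]` returns a copy, so `fillna(..., inplace=True)` modifies that copy and not `df`. You need to assign the result back.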
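
A small sketch of this behaviour, assuming the question's `df`. The variables `missing_before` and `missing_after` are only illustrative helpers for counting the remaining NaN values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[0.1, 2, 55, 0, np.nan],
                   [0.2, 4, np.nan, 1, 99],
                   [0.3, np.nan, 22, 5, 88],
                   [0.4, np.nan, np.nan, 4, 77]],
                  columns=list('ABCDE'))

# Selecting with a list of columns builds a new DataFrame, so the in-place
# fill happens on that temporary object (some pandas versions emit a
# warning here) and df itself is left unchanged.
df[['B', 'C']].fillna(0, inplace=True)
missing_before = int(df[['B', 'C']].isna().sum().sum())

# Assigning the filled copy back is what actually updates df.
df[['B', 'C']] = df[['B', 'C']].fillna(0)
missing_after = int(df[['B', 'C']].isna().sum().sum())

print(missing_before, missing_after)
```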

3) How do I ffill along the rows (is it implemented)?

Forward filling works fine if you assign the result back instead of using inplace:

df1 = df.fillna(method='ffill',axis=1)
print (df1)
     A    B     C    D     E
0  0.1  2.0  55.0  0.0   0.0
1  0.2  4.0   4.0  1.0  99.0
2  0.3  0.3  22.0  5.0  88.0
3  0.4  0.4   0.4  4.0  77.0

df2 = df.fillna(method='ffill',axis=0)
print (df2)
     A    B     C  D     E
0  0.1  2.0  55.0  0   NaN
1  0.2  4.0  55.0  1  99.0
2  0.3  4.0  22.0  5  88.0
3  0.4  4.0  22.0  4  77.0
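
In newer pandas releases (2.1 and later, as far as I know) the `method=` argument of `fillna` is deprecated, so the same row-wise fill can be sketched with `DataFrame.ffill` instead. This assumes the question's original `df`; the name `out` is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[0.1, 2, 55, 0, np.nan],
                   [0.2, 4, np.nan, 1, 99],
                   [0.3, np.nan, 22, 5, 88],
                   [0.4, np.nan, np.nan, 4, 77]],
                  columns=list('ABCDE'))

# axis=1 propagates the last valid value from left to right within each
# row; the result is returned as a new frame, so it is assigned to a name.
out = df.ffill(axis=1)
print(out)
```

Note that this fills from the nearest column to the left, not specifically from column A, so row 1 of C becomes 4.0 (from B) and not 0.2.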

Upvotes: 6
