Reputation: 4872
Pandas version 0.23.4, Python version 3.7.1
I have a DataFrame df as below:
import numpy as np
import pandas as pd

df = pd.DataFrame([[0.1, 2, 55, 0, np.nan],
                   [0.2, 4, np.nan, 1, 99],
                   [0.3, np.nan, 22, 5, 88],
                   [0.4, np.nan, np.nan, 4, 77]],
                  columns=list('ABCDE'))
A B C D E
0 0.1 2.0 55.0 0 NaN
1 0.2 4.0 NaN 1 99.0
2 0.3 NaN 22.0 5 88.0
3 0.4 NaN NaN 4 77.0
I want to replace the NaN values in columns B and C with the value in column A.
Expected output is
A B C D E
0 0.1 2.0 55.0 0 NaN
1 0.2 4.0 0.2 1 99.0
2 0.3 0.3 22.0 5 88.0
3 0.4 0.4 0.4 4 77.0
I have tried fillna using ffill along axis 0, but it's not giving the expected output (it fills from the row above in the same column):
df.fillna(method='ffill',axis=0, inplace = True)
A B C D E
0 0.1 2.0 55.0 0 NaN
1 0.2 4.0 55.0 1 99.0
2 0.3 4.0 22.0 5 88.0
3 0.4 4.0 22.0 4 77.0
df.fillna(method='ffill',axis=1, inplace = True)
output: NotImplementedError:
Also tried
df[['B','C']] = df[['B','C']].fillna(df.A)
output:
A B C D E
0 0.1 2.0 55.0 0 NaN
1 0.2 4.0 NaN 1 99.0
2 0.3 NaN 22.0 5 88.0
3 0.4 NaN NaN 4 77.0
I also tried to fill all NaNs in B and C with 0 using inplace, but this is not giving the expected output either:
df[['B','C']].fillna(0,inplace=True)
output:
A B C D E
0 0.1 2.0 55.0 0 NaN
1 0.2 4.0 NaN 1 99.0
2 0.3 NaN 22.0 5 88.0
3 0.4 NaN NaN 4 77.0
Filling a slice of the data frame with 0 does work if the result is assigned back to the same subset:
df[['B','C']] = df[['B','C']].fillna(0)
output:
A B C D E
0 0.1 2.0 55.0 0 NaN
1 0.2 4.0 0.0 1 99.0
2 0.3 0.0 22.0 5 88.0
3 0.4 0.0 0.0 4 77.0
1) How do I fill NaN values in columns B and C using values from column A of the given data frame?
2) Why is inplace not working when using fillna on a subset of the data frame?
3) How do I ffill along the rows (is it implemented)?
Upvotes: 4
Views: 3461
Reputation: 862601
1) How to fill na values in columns B and C using values from column A from the given data frame?
Because fillna with a Series only fills column by column (matching on column labels), one possible solution is a double transpose:
df[['B','C']] = df[['B','C']].T.fillna(df['A']).T
print (df)
A B C D E
0 0.1 2.0 55.0 0 NaN
1 0.2 4.0 0.2 1 99.0
2 0.3 0.3 22.0 5 88.0
3 0.4 0.4 0.4 4 77.0
Or:
m = df[['B','C']].isna()
df[['B','C']] = df[['B','C']].mask(m, m.astype(int).mul(df['A'], axis=0))
print (df)
A B C D E
0 0.1 2.0 55.0 0 NaN
1 0.2 4.0 0.2 1 99.0
2 0.3 0.3 22.0 5 88.0
3 0.4 0.4 0.4 4 77.0
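Another option, shown here only as a sketch and not taken from the answer above, is to fill each column on its own. Series.fillna with another Series aligns on the row index, so the label mismatch that affects DataFrame.fillna does not come up. The loop over ['B', 'C'] is my own choice for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[0.1, 2, 55, 0, np.nan],
                   [0.2, 4, np.nan, 1, 99],
                   [0.3, np.nan, 22, 5, 88],
                   [0.4, np.nan, np.nan, 4, 77]],
                  columns=list('ABCDE'))

# Series.fillna(other_series) matches values by row index,
# so each NaN in B or C is replaced by the value of A in the same row.
for col in ['B', 'C']:
    df[col] = df[col].fillna(df['A'])

print(df)
```

Columns D and E are not touched, so the NaN in E stays as it is.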
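2) Also why is inplace not working when using fillna on a subset of the data frame?
I think the reason is chained assignment: df[['B','C']] returns a copy, so fillna with inplace=True modifies that copy and not df. You need to assign the result back.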
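A minimal sketch of what I believe is going on, using the question's data. The names subset, nan_left_in_df, nan_left_in_subset and nan_after_assign are only for illustration. Selecting with a list of columns gives a new DataFrame, so the inplace fill lands on that object only.

```python
import warnings

import numpy as np
import pandas as pd

df = pd.DataFrame([[0.1, 2, 55, 0, np.nan],
                   [0.2, 4, np.nan, 1, 99],
                   [0.3, np.nan, 22, 5, 88],
                   [0.4, np.nan, np.nan, 4, 77]],
                  columns=list('ABCDE'))

with warnings.catch_warnings():
    # Depending on the pandas version, a copy-related warning may be
    # emitted here; it is silenced only to keep the demo output clean.
    warnings.simplefilter("ignore")
    subset = df[['B', 'C']]         # new object holding copies of B and C
    subset.fillna(0, inplace=True)  # fills the copy, df is not changed

nan_left_in_df = int(df[['B', 'C']].isna().sum().sum())
nan_left_in_subset = int(subset.isna().sum().sum())
print(nan_left_in_df, nan_left_in_subset)

# Assigning the filled result back writes the values into df itself.
df[['B', 'C']] = df[['B', 'C']].fillna(0)
nan_after_assign = int(df[['B', 'C']].isna().sum().sum())
print(nan_after_assign)
```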
3) How to do ffill along the rows(is it implemented)?
Forward filling works fine in both directions if you assign the result back instead of using inplace:
df1 = df.fillna(method='ffill',axis=1)
print (df1)
A B C D E
0 0.1 2.0 55.0 0.0 0.0
1 0.2 4.0 4.0 1.0 99.0
2 0.3 0.3 22.0 5.0 88.0
3 0.4 0.4 0.4 4.0 77.0
df2 = df.fillna(method='ffill',axis=0)
print (df2)
A B C D E
0 0.1 2.0 55.0 0 NaN
1 0.2 4.0 55.0 1 99.0
2 0.3 4.0 22.0 5 88.0
3 0.4 4.0 22.0 4 77.0
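As far as I know, newer pandas releases deprecate the method= argument of fillna in favour of the dedicated ffill method. The following sketch shows the same two calls in that form, starting from the question's original data. The names row_filled and col_filled are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[0.1, 2, 55, 0, np.nan],
                   [0.2, 4, np.nan, 1, 99],
                   [0.3, np.nan, 22, 5, 88],
                   [0.4, np.nan, np.nan, 4, 77]],
                  columns=list('ABCDE'))

# axis=1: each NaN takes the nearest non-NaN value to its left in the same row.
row_filled = df.ffill(axis=1)
print(row_filled)

# axis=0: each NaN takes the nearest non-NaN value above it in the same column.
col_filled = df.ffill(axis=0)
print(col_filled)
```

Note that filling along the rows takes the nearest value to the left, so C in row 1 becomes 4.0 (from B), not 0.2 (from A). It is therefore not a substitute for the answer to question 1.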
Upvotes: 6