Reputation: 809
I'm trying to fill some NaN values in one column of my dataframe. From what I understand from the docs, I should be able to pass a Pandas Series to fillna, and Pandas will then fill my NaNs with the Series I provided.
The code is like this:
XTrain_pd[class_name] = XTrain_pd[class_name].fillna(pd.Series(train_pred))
So it should fill in the NaN values based on the values from train_pred.
I made sure that the length of train_pred and the number of NaNs to be filled are the same:
print(XTrain_pd[class_name].isna().sum(),print(train_pred.shape))
This outputs:
(9,)
9 None
I also printed out XTrain_pd before and after using fillna on the NaN values. The left image is before fillna, the right image is after fillna.
Some mysterious things happen here. Firstly, only one NaN value is imputed, in row #6. Secondly, my pd.NA values get converted to np.nan values. What is going on here?
Upvotes: 0
Views: 778
Reputation: 6667
TL;DR: Use .loc to select the NaN rows and assign the predictions to them:
df.loc[df.class_name.isna(), 'class_name'] = train_pred
Consider a dataframe with two null values, at index 3 and 9:
import numpy as np
import pandas as pd

d = {
    'col_str': ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'],
    'col_float': [1, 2, 3, np.nan, 5, 6, 7, 8, 9, np.nan]
}
df = pd.DataFrame(d)
df
>>>
  col_str  col_float
0       a        1.0
1       b        2.0
2       c        3.0
3       d        NaN
4       e        5.0
5       f        6.0
6       g        7.0
7       h        8.0
8       i        9.0
9       j        NaN
If you want to replace the null values with the predictions in train_pred, just select the NaN rows of col_float and assign the predictions to them.
train_pred = [4.0, 10.0]
df.loc[df.col_float.isna(), 'col_float'] = train_pred
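The assignment above is positional: the first prediction goes to the first NaN row, the second to the second, and so on. As a rough sketch of a safer variant, the helper below first checks that the number of predictions matches the number of missing rows. The name fill_missing_in_order is hypothetical (it is not a pandas function), and it assumes the predictions are ordered the same way as the NaN rows.

```python
import numpy as np
import pandas as pd


def fill_missing_in_order(df, column, values):
    """Assign `values`, in order, to the NaN rows of `column` (hypothetical helper)."""
    mask = df[column].isna()
    n_missing = int(mask.sum())
    if n_missing != len(values):
        raise ValueError(f"expected {n_missing} values, got {len(values)}")
    out = df.copy()  # leave the caller's frame untouched
    # Positional assignment: the i-th value goes to the i-th NaN row.
    out.loc[mask, column] = np.asarray(values)
    return out


df = pd.DataFrame({
    "col_str": list("abcdefghij"),
    "col_float": [1, 2, 3, np.nan, 5, 6, 7, 8, 9, np.nan],
})
filled = fill_missing_in_order(df, "col_float", [4.0, 10.0])
print(filled["col_float"].tolist())
```

With the example frame, rows 3 and 9 receive 4.0 and 10.0, and the original df is left unchanged.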
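If you were to use fillna(), you would need to specify the value for each index of the Series, because fillna aligns the Series you pass on its index labels rather than filling the NaN values in order. pd.Series(train_pred) is indexed 0, 1, 2, ..., so only the NaN rows whose labels fall in that range get a value.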
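Since fillna matches a Series argument by index label, a small sketch on the same example frame may help. It compares a Series with the default index against one indexed by the labels of the NaN rows. The variable names (misaligned, nan_index, aligned) are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "col_str": list("abcdefghij"),
    "col_float": [1, 2, 3, np.nan, 5, 6, 7, 8, 9, np.nan],
})
train_pred = [4.0, 10.0]

# Default index is 0 and 1; the NaN rows are labelled 3 and 9, so nothing matches.
misaligned = df["col_float"].fillna(pd.Series(train_pred))
print(misaligned.isna().sum())  # 2, both NaN values remain

# Index the predictions by the labels of the NaN rows so fillna can align them.
nan_index = df.index[df["col_float"].isna()]
aligned = df["col_float"].fillna(pd.Series(train_pred, index=nan_index))
print(aligned.tolist())
```

In the second call every NaN label has a matching entry in the Series, so both gaps are filled.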
Upvotes: 1