Reputation: 257
y = data.loc[data['column1'] != float('NaN'),'column1']
The code above still returns rows with NaN values in 'column1'. I'm not sure what I'm doing wrong. Please help!
Upvotes: 4
Views: 8549
Reputation: 402483
NaN, by definition, is not equal to NaN.
In [1262]: np.nan == np.nan
Out[1262]: False
Read up about the mathematical concept on Wikipedia.
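To see why the filter in the question keeps every row, here is a minimal sketch. The small `data` frame and its values are made up for illustration and are not the asker's data. It shows that `!=` against NaN is True for every element, so the mask removes nothing.

```python
import numpy as np
import pandas as pd

# Illustrative data only; the asker's real frame is not shown in the question.
data = pd.DataFrame({"column1": [1.0, np.nan, 4.0, np.nan]})

# Comparing against NaN with != is True for every element, NaN included,
# so this mask selects all rows.
mask = data["column1"] != float("NaN")
print(mask.tolist())  # [True, True, True, True]

kept = data.loc[mask, "column1"]
print(kept.isna().sum())  # 2 (both NaN rows are still there)

# NaN is the only float value that is not equal to itself.
self_equal = data["column1"] == data["column1"]
print(self_equal.tolist())  # [True, False, True, False]
```

The self-comparison at the end is only there to show the rule at work; the options below are the clearer way to do the filtering.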
Option 1
Using pd.Series.notnull:
df
column1
0 1.0
1 2.0
2 345.0
3 NaN
4 4.0
5 10.0
6 NaN
7 100.0
8 NaN
y = df.loc[df.column1.notnull(), 'column1']
y
0 1.0
1 2.0
2 345.0
4 4.0
5 10.0
7 100.0
Name: column1, dtype: float64
Option 2
As MSeifert suggested, you could use np.isnan:
y = df.loc[~np.isnan(df.column1), 'column1']
y
0 1.0
1 2.0
2 345.0
4 4.0
5 10.0
7 100.0
Name: column1, dtype: float64
Option 3
If it's just the one column, call pd.Series.dropna:
y = df.column1.dropna()
y
0 1.0
1 2.0
2 345.0
4 4.0
5 10.0
7 100.0
Name: column1, dtype: float64
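For anyone who wants to run the three options side by side, here is a self-contained sketch. It rebuilds the example frame from the values printed above (the constructor call is my assumption about how `df` was made) and checks that all three approaches return the same Series.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the values shown in the answer.
df = pd.DataFrame(
    {"column1": [1.0, 2.0, 345.0, np.nan, 4.0, 10.0, np.nan, 100.0, np.nan]}
)

y1 = df.loc[df["column1"].notnull(), "column1"]   # Option 1
y2 = df.loc[~np.isnan(df["column1"]), "column1"]  # Option 2
y3 = df["column1"].dropna()                       # Option 3

# All three keep the same rows and values.
assert y1.equals(y2) and y2.equals(y3)
print(y1.index.tolist())  # [0, 1, 2, 4, 5, 7]
```

Options 1 and 2 build a boolean mask, so they can also be used to select other columns of the same rows; Option 3 is the shortest when only the one column is needed.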
Upvotes: 3