Reputation: 1094
In some data I am processing, I am encountering values of type float that are NaN, i.e. float('nan').
However, checking for it does not work as expected:
float('nan') == float('nan')
>> False
You can check for it with math.isnan, but as my data also contains strings (for example 'nan', but also other user input), that is not very convenient:
import math
math.isnan(float('nan'))
>> True
math.isnan('nan')
>> TypeError: must be real number, not str
In the ideal world I would like to check if a value is in a list of all possible NaN values, for example:
import numpy as np
if x in ['nan', np.nan, ... ]:
    # Do something
    pass
Now the question:
How can I still use this approach but also check for float('nan') values? And why does float('nan') == float('nan') evaluate to False?
Upvotes: 16
Views: 34181
Reputation: 22696
pandas isnull (or its opposite, notnull) handles the inputs on which math.isnan and np.isnan raise an exception, such as strings and None.
import numpy as np
import pandas as pd
for x in [float('nan'), 'nan', np.nan, None, 'None', '']:
    print(pd.isnull(x))
True
False
True
True
False
False
Note pandas is a large import, but if you're already using it (which you may well be if you're doing a lot of data processing), it's handy.
Upvotes: 1
Reputation: 19395
Why not just wrap whatever you pass to math.isnan with another float conversion? If it was already a float (i.e. float('nan')), you have just made a "redundant" call:
import math
def is_nan(value):
    return math.isnan(float(value))
And this seems to give your expected results:
>>> is_nan(float('nan'))
True
>>> is_nan('nan')
True
>>> is_nan(np.nan)
True
>>> is_nan(5)
False
>>> is_nan('5')
False
This will still raise a ValueError for non-numeric strings (other than 'nan'). If that's a problem, you can wrap the call in try/except. As long as the float conversion works, there is no reason for isnan to fail, so we are basically catching non-numeric strings that may fail the float conversion:
def is_nan(value):
    try:
        return math.isnan(float(value))
    except ValueError:
        return False
Any non-numeric string is surely not a NaN value, so return False.
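A possible extension, as a sketch rather than a definitive fix: float() raises TypeError (not ValueError) for inputs such as None or a list, and OverflowError for very large integers, so data that mixes those in would still fail. The variant below assumes you want anything that cannot be converted to count as "not NaN"; the name is_nan_safe is made up for illustration.

```python
import math


def is_nan_safe(value):
    """Return True only if value is, or parses to, a float NaN.

    Hypothetical helper: anything float() cannot convert is treated
    as "not NaN" instead of raising.
    """
    try:
        return math.isnan(float(value))
    except (ValueError, TypeError, OverflowError):
        # ValueError: non-numeric string such as 'abc' or ''
        # TypeError: None, lists, and other non-convertible types
        # OverflowError: integers too large to fit in a float
        return False
```

Note that float() ignores surrounding whitespace and letter case, so strings such as ' NaN ' are also reported as NaN by this approach.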
Upvotes: 13
Reputation: 117
You can check for a NaN value like this:
def isNaN(num):
    if num == 'nan':
        return True
    # A float NaN is not equal to itself
    return num != num
Upvotes: -2
Reputation: 8407
It's very pedestrian, and a bit ugly, but why not just do the following?
import math
import numpy as np
if math.isnan(x) or x in ['nan', np.nan, ... ]:
    # Do something
    pass
I want to recommend a Yoda expression but haven't quite worked it out yet.
If you want to sweep everything under the carpet put it in a lambda or function.
Following on from https://stackoverflow.com/a/64090763/1021819, you can try to get the iterator to evaluate any in a lazy fashion. The problem then is that if none of the first conditions evaluates to True, the math.isnan() call is executed and can still throw the TypeError. If you evaluate lazily, you can guard the math.isnan() call with a type check against str:
fn_list_to_check = [
    lambda x: x in ['nan', np.nan, ... ],
    # Guard: only call math.isnan() on non-strings to avoid the TypeError
    lambda x: not isinstance(x, str) and math.isnan(x),
]
if any(f(x) for f in fn_list_to_check):
    # Do something
    pass
Note the absence of square list brackets in the any call, i.e. any() not any([]) (who knew?).
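To see the difference, here is a small sketch that records which checks actually run. The check helper and the calls list are made up for illustration. With a generator expression, any() stops at the first True; with a list comprehension, every check runs before any() even starts.

```python
calls = []


def check(name, result):
    """Hypothetical check that logs its own name before returning."""
    calls.append(name)
    return result


checks = [("first", True), ("second", False)]

# Generator expression: evaluated lazily, stops after the first True
calls.clear()
any(check(name, result) for name, result in checks)
lazy_calls = list(calls)

# List comprehension: the whole list is built before any() sees it
calls.clear()
any([check(name, result) for name, result in checks])
eager_calls = list(calls)

print(lazy_calls)   # ['first']
print(eager_calls)  # ['first', 'second']
```

That short-circuiting is what keeps a later check from running once an earlier one has already matched.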
I think it's quite brilliant but equally ugly - choose your poison.
For the second part of the question (why float('nan') != float('nan')), see "What is the rationale for all comparisons returning false for IEEE754 NaN values?"
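As a quick illustration of that behaviour, and of why the list approach from the question misses float('nan'): Python's in operator checks identity before equality, so a NaN is only found in a list if it is the very same object. A minimal sketch using only plain floats (the variable names are just for illustration):

```python
a = float('nan')
b = float('nan')  # a second, distinct NaN object

equal = (a == b)        # False: NaN compares unequal to everything
self_equal = (a == a)   # False: even to itself
self_unequal = (a != a)  # True: the basis of the x != x trick

# `x in seq` behaves like any(x is e or x == e for e in seq)
same_object_found = a in [a]   # True, because of the identity check
other_object_found = b in [a]  # False, identity and equality both fail

print(equal, self_equal, self_unequal)
print(same_object_found, other_object_found)
```

This would also explain why np.nan in [np.nan] happens to work: np.nan is a single module-level float object, so the identity check succeeds, whereas every float('nan') call creates a new object.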
Upvotes: 4