Reputation: 970
I have a mixed-type pd.Series like [100, 50, 0, 'foo', 'bar', 'baz']. When I run pd.Series.str.isnumeric() on it, I get [NaN, NaN, NaN, False, False, False].
Why is this happening? Shouldn't it return True for the first three elements in this series?
Upvotes: 6
Views: 8992
Reputation: 164793
Pandas string methods follow Python methods closely:
str.isnumeric(100) # TypeError
str.isnumeric('100') # True
str.isnumeric('a10') # False
Any type which yields an error will give NaN. As per the Python docs, str.isnumeric is only applicable to strings:
str.isnumeric()
Return true if all characters in the string are numeric characters, and there is at least one character, false otherwise.
As per the Pandas docs, pd.Series.str.isnumeric is equivalent to str.isnumeric:
Series.str.isnumeric()
Check whether all characters in each string in the Series/Index are numeric. Equivalent to str.isnumeric().
Your series has "object" dtype, an all-encompassing type that holds pointers to arbitrary Python objects. These may be a mixture of strings, integers, etc. Therefore, you should expect NaN values wherever an element is not a string.
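To see this concretely, here is a minimal sketch that rebuilds the series from the question and inspects it. The variable names (dtype_name, element_types, missing_mask) are made up for illustration and are not part of pandas.

```python
import pandas as pd

# The mixed series from the question: three ints followed by three strings.
s = pd.Series([100, 50, 0, "foo", "bar", "baz"])

# Mixed ints and strings are stored as generic Python objects.
dtype_name = str(s.dtype)

# Each element keeps its original Python type inside the object array.
element_types = [type(v).__name__ for v in s]

# The .str accessor only evaluates the string elements; the ints come back as NaN.
result = s.str.isnumeric()
missing_mask = result.isna().tolist()

print(dtype_name)
print(element_types)
print(missing_mask)
```

The positions flagged in missing_mask line up with the int elements, which is why the first three results are NaN rather than True.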
To accommodate numeric types, you need to convert to strings explicitly, e.g. given a series s:
s.astype(str).str.isnumeric()
Upvotes: 13
Reputation: 51175
The string accessor itself converts your numbers to NaN; this happens before you even try to use isnumeric:
import pandas as pd
s = pd.Series([100, 50, 0, 'foo', 'bar', 'baz'])
s.str[:]
0 NaN
1 NaN
2 NaN
3 foo
4 bar
5 baz
dtype: object
So the NaNs remain when you use isnumeric. Use astype first instead:
s.astype(str).str.isnumeric()
0 True
1 True
2 True
3 False
4 False
5 False
dtype: bool
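If you need this in more than one place, the cast-then-check step can be wrapped in a small helper. This is only a sketch: isnumeric_any is a hypothetical name, not a pandas function, and it assumes you want every element judged by its string form.

```python
import pandas as pd


def isnumeric_any(series: pd.Series) -> pd.Series:
    """Hypothetical helper: cast every element to str, then apply str.isnumeric."""
    return series.astype(str).str.isnumeric()


s = pd.Series([100, 50, 0, "foo", "bar", "baz"])

# Boolean flags, one per element, with no NaN because everything is now a string.
flags = isnumeric_any(s)

# Use the flags as a mask to keep only the elements whose string form is all numeric.
numeric_rows = s[flags]

# Caveat: "-" and "." are not numeric characters, so these both come back False.
signed_or_decimal = isnumeric_any(pd.Series([-5, 1.5]))

print(flags.tolist())
print(numeric_rows.tolist())
print(signed_or_decimal.tolist())
```

Keep in mind that str.isnumeric checks characters, not whether a value parses as a number, so negative numbers and decimals are reported as False after the cast.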
Upvotes: 5