Reputation: 399
I have a dataframe column containing integers, floating-point numbers and strings. I want to process this column depending on what type of data is present in each record.
The problem is that I can pick out the integer records with a Series.str.isnumeric() call, but floating-point numbers return False there. How can I identify ints and floats together? Here is a basic example:
import numpy as np
import pandas as pd
d = {'A': ['1234', '12.16', '1234m']}
df = pd.DataFrame(d)
df.A.str.isnumeric()
I currently get [True, False, False], but I expect [True, True, False].
Upvotes: 2
Views: 4942
Reputation: 42886
Use pd.to_numeric with the argument errors="coerce" and check which values do not come out as NaN:
pd.to_numeric(df['A'],errors='coerce').notna()
0 True
1 True
2 False
Name: A, dtype: bool
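Since the goal is to process rows differently depending on their type, the boolean result can be used directly as a mask. A minimal sketch, assuming the same sample frame as in the question; the names parsed, is_number, numeric_rows, text_rows and the extra column A_num are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'A': ['1234', '12.16', '1234m']})

# Parse what can be parsed; anything that fails becomes NaN.
parsed = pd.to_numeric(df['A'], errors='coerce')
is_number = parsed.notna()

# Split the frame into rows that parsed as numbers and rows that did not.
numeric_rows = df[is_number]
text_rows = df[~is_number]

# Illustrative extra column holding the parsed value (NaN for text rows).
df['A_num'] = parsed
```

Keeping the parsed values in a separate column leaves the original strings untouched, which may be handy if the text rows need their own handling later.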
If you want to use str.isnumeric, note that it does not treat the . as a decimal point, so we have to remove it first (regex=False makes the . a literal character in every pandas version):
df['A'].str.replace('.', '', regex=False).str.isnumeric()
0 True
1 True
2 False
Name: A, dtype: bool
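One caveat with stripping the dot: a string such as '1.2.3' would then also count as numeric, and a signed value such as '-5' would not, because str.isnumeric rejects the sign. If that matters, a stricter pattern check is one possible alternative. This is only a sketch; the pattern and the two extra sample values are assumptions added for illustration:

```python
import pandas as pd

s = pd.Series(['1234', '12.16', '1234m', '1.2.3', '-5'])

# Optional sign, one or more digits, then an optional fractional part.
pattern = r'[+-]?\d+(?:\.\d+)?'

# fullmatch requires the whole string to match the pattern.
mask = s.str.fullmatch(pattern)
```

Here mask is True for '1234', '12.16' and '-5', and False for '1234m' and '1.2.3'.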
Thinking ahead to what you probably want to do, you can write a try/except to convert each element to its own type without losing any rows to NaN:
def convert_numeric(x):
    # Return x as a number when it parses, otherwise return it unchanged.
    try:
        return pd.to_numeric(x)
    except (ValueError, TypeError):
        return x
df['A'].apply(convert_numeric)
0 1234
1 12.16
2 1234m
Name: A, dtype: object
If we then check the type of each value, we see that the column now holds mixed types:
df['A'].apply(convert_numeric).apply(type)
0 <class 'numpy.int64'>
1 <class 'numpy.float64'>
2 <class 'str'>
Name: A, dtype: object
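With mixed types in the column, the per-type processing can branch on isinstance. A rough sketch, assuming you only need to tell ints, floats and everything else apart; kind_of and its labels are hypothetical names, not part of pandas:

```python
import numpy as np
import pandas as pd

def convert_numeric(x):
    # Return x as a number when it parses, otherwise return it unchanged.
    try:
        return pd.to_numeric(x)
    except (ValueError, TypeError):
        return x

def kind_of(value):
    # Hypothetical helper: label a converted value by its type.
    if isinstance(value, (int, np.integer)):
        return 'int'
    if isinstance(value, (float, np.floating)):
        return 'float'
    return 'text'

df = pd.DataFrame({'A': ['1234', '12.16', '1234m']})
kinds = df['A'].apply(convert_numeric).apply(kind_of)
```

Replace the returned labels with whatever processing each type needs.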
Upvotes: 5
Reputation: 1907
def my_func(x):
    # True when float() can parse x, False otherwise.
    try:
        float(x)
    except ValueError:
        return False
    return True
df['A'].apply(my_func)
0 True
1 True
2 False
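Note that float() also accepts strings such as 'nan' and 'inf', and raises TypeError rather than ValueError for None. If those should count as non-numeric, a slightly stricter variant could look like this; is_finite_number is a hypothetical name and the finiteness rule is an assumption about what you want:

```python
import math

import pandas as pd

def is_finite_number(x):
    # Hypothetical variant: True only when x parses to a finite float.
    try:
        value = float(x)
    except (ValueError, TypeError):
        return False
    return math.isfinite(value)

s = pd.Series(['1234', '12.16', '1234m', 'nan', 'inf', None])
result = s.apply(is_finite_number)
```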
Upvotes: 0