Reputation: 2570
How can I check if a column is a string, or another type (e.g. int or float), even though the dtype is object?
(Ideally I want this operation vectorised, rather than using applymap to check every row...)
import io
import pandas as pd
# American post code
df1_str = """id,postal
1,12345
2,90210
3,"""
df1 = pd.read_csv(io.StringIO(df1_str))
df1["postal"] = df1["postal"].astype("O") # dtype is object, but the values are floats (the empty row 3 is read as NaN)
# British post codes
df2_str = """id,postal
1,EC1
2,SE1
3,W2"""
df2 = pd.read_csv(io.StringIO(df2_str))
df2["postal"] = df2["postal"].astype("O") # dtype is object, and the values are strings
Both df1 and df2 return object when doing df["postal"].dtype. df2 has .str methods, e.g. df2["postal"].str.lower(), but df1 doesn't. df1 can have mathematical operations done to it, e.g. df1 * 2.
This is different from other SO questions, which ask whether there are strings inside a column (and not whether the WHOLE column is strings), e.g.:
Upvotes: 3
Views: 2281
Reputation: 4761
You can use pandas.api.types.infer_dtype:
>>> pd.api.types.infer_dtype(df2["postal"])
'string'
>>> pd.api.types.infer_dtype(df1["postal"])
'floating'
From the docs:
Efficiently infer the type of a passed val, or list-like array of values. Return a string describing the type.
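As a rough sketch of how this could be used to branch on a column's content, the helper below wraps infer_dtype and groups its result into "string" or "numeric". The name describe_object_column and the grouping are my own assumptions for illustration, not part of pandas; skipna=True is passed explicitly so the missing value in df1 is ignored during inference.

```python
import io

import pandas as pd
from pandas.api.types import infer_dtype


def describe_object_column(series: pd.Series) -> str:
    """Hypothetical helper: classify a column by its inferred content type."""
    # infer_dtype inspects the values rather than trusting the declared dtype;
    # skipna=True ignores missing values while inferring.
    kind = infer_dtype(series, skipna=True)
    if kind == "string":
        return "string"
    if kind in ("integer", "floating", "mixed-integer-float"):
        return "numeric"
    # Anything else (e.g. "mixed", "boolean") is returned unchanged.
    return kind


# Same data as in the question: both columns end up with dtype object.
df1 = pd.read_csv(io.StringIO("id,postal\n1,12345\n2,90210\n3,"))
df2 = pd.read_csv(io.StringIO("id,postal\n1,EC1\n2,SE1\n3,W2"))
df1["postal"] = df1["postal"].astype("O")
df2["postal"] = df2["postal"].astype("O")

print(describe_object_column(df1["postal"]))
print(describe_object_column(df2["postal"]))
```

With the example frames from the question, this should report df1["postal"] as numeric and df2["postal"] as string, so you can decide whether .str methods or arithmetic are appropriate without looping over rows yourself.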
Upvotes: 7