Reputation: 3005
What is an efficient way to split the columns of a pandas DataFrame into categorical columns and numeric columns in Python?
So far I'm using the function below to find the categorical and numeric columns.
def returnCatNumList(df):
    object_cols = list(df.select_dtypes(exclude=['int', 'float', 'int64', 'float64',
                                                 'int32', 'float32', 'int16', 'float16']).columns)
    numeric_cols = list(df.select_dtypes(include=['int', 'float', 'int64', 'float64',
                                                  'int32', 'float32', 'int16', 'float16']).columns)
    return object_cols, numeric_cols
I'm looking for an efficient and better approach to do this. Any suggestions or references would be highly appreciated.
Upvotes: 2
Views: 284
Reputation: 3005
We can also use the pandas types API (pd.api.types), which lets us inspect the dtype of each column:
import pandas as pd
import seaborn as sns

def returnCatNumList(df):
    object_cols = []
    numeric_cols = []
    for label, content in df.items():
        # String columns are treated as categorical, everything else as numeric
        if pd.api.types.is_string_dtype(content):
            object_cols.append(label)
        else:
            numeric_cols.append(label)
    return object_cols, numeric_cols
Example:
iris = sns.load_dataset('iris')
object_cols, numeric_cols = returnCatNumList(iris)
print(object_cols)
print(numeric_cols)
output:
['species']
['sepal_length', 'sepal_width', 'petal_length', 'petal_width']
Upvotes: 0
Reputation: 815
You can do this simply by selecting on the object dtype:
def returnCatNumList(df):
    object_cols = df.select_dtypes(include="object").columns.tolist()
    numeric_cols = df.select_dtypes(exclude="object").columns.tolist()
    return object_cols, numeric_cols
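One thing to watch with an object/non-object split: anything that is not object dtype ends up in the second list, including category, bool and datetime columns. A minimal sketch of this, using a made-up DataFrame (the column names and values are hypothetical, chosen only for illustration):

```python
import pandas as pd

# Hypothetical sample frame, made up for illustration only.
df = pd.DataFrame({
    "name": pd.Series(["a", "b"], dtype=object),            # object dtype
    "size": pd.Categorical(["S", "M"]),                     # category dtype
    "flag": [True, False],                                  # bool dtype
    "when": pd.to_datetime(["2021-01-01", "2021-01-02"]),   # datetime64 dtype
    "count": [1, 2],                                        # integer dtype
    "price": [1.5, 2.5],                                    # float dtype
})

# Same selection logic as the function above.
object_cols = df.select_dtypes(include="object").columns.tolist()
non_object_cols = df.select_dtypes(exclude="object").columns.tolist()

print(object_cols)      # only the object column
print(non_object_cols)  # everything else, not just the numeric columns
```

If the data can contain such columns, it may be safer to select the numeric columns explicitly and treat the remainder as categorical.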
Upvotes: 1
Reputation: 863166
You can simplify your approach by using np.number instead of a list of numeric dtypes:
def returnCatNumList(df):
    object_cols = list(df.select_dtypes(exclude=np.number).columns)
    numeric_cols = list(df.select_dtypes(include=np.number).columns)
    return object_cols, numeric_cols
Another idea is to use Index.difference for numeric_cols:
def returnCatNumList(df):
    object_cols = list(df.select_dtypes(exclude=np.number).columns)
    numeric_cols = list(df.columns.difference(object_cols, sort=False))
    return object_cols, numeric_cols
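As a rough illustration of how np.number and Index.difference behave together, here is a small self-contained sketch. The DataFrame is made up for the example (the is_wild and count columns are hypothetical), and it selects the numeric columns first and takes the remainder, which is the mirror image of the function above:

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame, made up for illustration only.
df = pd.DataFrame({
    "species": pd.Series(["setosa", "virginica"], dtype=object),
    "sepal_length": [5.1, 6.3],
    "count": np.array([1, 2], dtype="int32"),
    "is_wild": [True, False],
})

# np.number matches every integer and float width, but not bool.
numeric_cols = df.select_dtypes(include=np.number).columns.tolist()

# sort=False keeps the remaining labels in their original column order.
other_cols = df.columns.difference(numeric_cols, sort=False).tolist()

print(numeric_cols)
print(other_cols)
```

Note that the bool column lands in the non-numeric list here, since bool is not a subtype of np.number.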
Upvotes: 2