Reputation: 2508
I am working with the Titanic dataset. I want to separate the numerical columns from the categorical columns. I tried to do this with these lines of code:
from pandas.api.types import is_string_dtype
from pandas.api.types import is_numeric_dtype

print("Numeric columns")
for column in dataset.columns:
    if is_numeric_dtype(dataset[column]):
        print(column)
print("----------------------------------")
print("Category columns")
for column in dataset.columns:
    if is_string_dtype(dataset[column]):
        print(column)
Output:
Numeric columns
Unnamed: 0
credit_amount
installment_commitment
residence_since
age
existing_credits
num_dependents
accepted
----------------------------------
Category columns
checking_status
duration
credit_history
purpose
savings_status
employment
personal_status
other_parties
property_magnitude
other_payment_plans
housing
job
own_telephone
foreign_worker
change_purpose
change_duration
So now I can clearly see which columns are numerical and which are categorical. Now I want to drop all the numerical columns, whose names are stored in columns_names:
dataset_numerical = dataset.select_dtypes(include=['int64'])
columns_names = dataset_numerical.tolist()
dataset = dataset.drop([columns_names], axis=1)
This is what is stored in columns_names:
['Unnamed: 0',
'credit_amount',
'installment_commitment',
'residence_since',
'age',
'existing_credits',
'num_dependents',
'accepted']
So obviously I made a mistake in the last line of code. Can anybody help me solve this?
I also tried these lines of code, but again without success:
to_drop = columns_names
to_drop_stripped = [x.strip() for x in to_drop.split(',')]
dataset.drop(columns=to_drop_stripped)
In the end I expect to drop all the columns whose names are stored in columns_names.
Upvotes: 2
Views: 61
Reputation: 13831
Your two chunks of code need some minor tweaks. I can't be sure this will work for you because I can't replicate your dataset exactly, but I think the code below should work.
# Code block 1
dataset_numerical = dataset.select_dtypes(include = ['int64'])
columns_names = dataset_numerical.columns.tolist() # added the .columns
dataset= dataset.drop(columns_names, axis=1) # removed the [] brackets
# Code block 2
to_drop = columns_names
to_drop_stripped = [x.strip() for x in to_drop] # removed .split() at the end
dataset = dataset.drop(columns=to_drop_stripped) # assigned the result, since drop returns a new DataFrame
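As a self-contained illustration of the first code block, here is a minimal sketch on a tiny stand-in DataFrame. Only the column names come from the question; the values are made up, and the names `numeric_columns` and `categorical` are hypothetical. It also uses `include="number"` instead of `['int64']`, which is an assumption on my part that you want to catch float columns as well.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset: column names are taken from
# the question, the values are invented for illustration only.
dataset = pd.DataFrame({
    "credit_amount": [1000, 2500, 400],
    "age": [35, 22, 49],
    "purpose": ["car", "education", "furniture"],
    "housing": ["own", "rent", "own"],
})

# Collect the names of all numeric columns (integers and floats).
numeric_columns = dataset.select_dtypes(include="number").columns.tolist()

# drop() returns a new DataFrame, so the result has to be assigned.
categorical = dataset.drop(columns=numeric_columns)

print(numeric_columns)
print(categorical.columns.tolist())
```

Passing the list directly (not wrapped in another list) is what makes `drop` accept it.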
Upvotes: 2
Reputation: 42
# Check the dtypes with
dataset.dtypes
# For a list of the columns that hold strings
print(dataset.select_dtypes(include=object).columns.values)
# Replace object with the dtype you are interested in, without quotes
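If the goal is only to keep the non-numeric columns, a possible shortcut is to let `select_dtypes` do the dropping through its `exclude` parameter. This is a sketch on a made-up frame, not your real data; the values and the name `non_numeric` are hypothetical.

```python
import pandas as pd

# Hypothetical example frame; the values are invented for illustration.
dataset = pd.DataFrame({
    "num_dependents": [1, 2, 1],
    "credit_amount": [1000.0, 2500.5, 400.0],
    "job": ["skilled", "unskilled", "skilled"],
    "foreign_worker": ["yes", "yes", "no"],
})

# Keep every column that is not numeric, without building a name list first.
non_numeric = dataset.select_dtypes(exclude="number")

print(non_numeric.columns.tolist())
```

This avoids the intermediate list of names entirely, at the cost of being less explicit about which columns were removed.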
Upvotes: 1