Aang

Reputation: 1

Can we standardize a numerical column which actually is categorical?

I have the House Prices - Advanced Regression Techniques data set and need to apply Lasso and Ridge regularization to it. I saved the training data in a variable named house and ran the following code:

house.info()

Got this output (image not reproduced).

There are columns in this data set that are stored as numerical types (int64 and float64) but are actually categorical (both ordinal and nominal).

Can I standardize these categorical variables directly? Or should I first convert them to type "object" using house[col_name]=house[col_name].astype(str), one-hot encode them, and standardize only the remaining numerical columns?

Upvotes: 0

Views: 485

Answers (1)

AlSub

Reputation: 1155

When a column is categorical (nominal), you can apply one-hot encoding, which turns each category into its own binary (0/1) column:

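import pandas as pd

# One-hot encode the listed columns; get_dummies takes them via `columns`.
# Each prefix is joined to the category value with "_" (the default prefix_sep).
raw_df = pd.get_dummies(data=raw_df,
                        columns=['col1', 'col2', 'col3'],
                        prefix=['feature1', 'feature2', 'feature3'])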
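
To connect this to the question, here is a minimal sketch of the second approach described there: treat the integer-coded column as text, one-hot encode it, and standardize only the genuinely numeric columns. The toy DataFrame and its values are made up for illustration, and the split into categorical_cols and numeric_cols is an assumption you would make yourself by reading the data description; it is not something pandas can infer from the dtypes.

```python
import pandas as pd

# Toy stand-in for the `house` DataFrame; the values are invented for illustration.
house = pd.DataFrame({
    "MSSubClass": [20, 60, 20, 70],         # integer codes, but really a nominal category
    "LotArea": [8450, 9600, 11250, 9550],   # genuinely numeric
    "GrLivArea": [1710, 1262, 1786, 1717],  # genuinely numeric
})

# Assumption: these lists are chosen by hand from the data description.
categorical_cols = ["MSSubClass"]
numeric_cols = [c for c in house.columns if c not in categorical_cols]

# Standardize only the numeric columns (zero mean, unit variance).
numeric_part = (house[numeric_cols] - house[numeric_cols].mean()) / house[numeric_cols].std(ddof=0)

# Cast the coded column to str so it is treated as categories, then one-hot encode it.
categorical_part = pd.get_dummies(house[categorical_cols].astype(str),
                                  columns=categorical_cols,
                                  prefix=categorical_cols,
                                  dtype=float)

# Recombine into a single feature matrix.
encoded = pd.concat([numeric_part, categorical_part], axis=1)
print(encoded)
```

Standardizing the raw integer codes instead would treat 20, 60 and 70 as points on a numeric scale, which is usually not what a nominal code means. When a separate test set is involved, the mean and standard deviation would normally be computed on the training data only and reused for the test data.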

Upvotes: 0
