Krissh

Reputation: 357

Normalization of a categorical variable

I have a dataset that contains gender as Male and Female. I converted Male to 1 and Female to 0 using pandas, and the column now has dtype int8. Now I want to normalize columns such as weight and height. What should be done with the gender column: should it be normalized or not? I am planning to use it in a linear regression.

Upvotes: 6

Views: 8757

Answers (1)

Tim

Reputation: 10709

So I think you are mixing up normalization with standardization.

Normalization:

rescales your data into the range [0, 1].

Standardization:

rescales your data to have a mean of 0 and a standard deviation of 1.
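To make the difference concrete, here is a minimal sketch of both transformations in plain Python. The function names `min_max_normalize` and `standardize` and the height values are made up for illustration, and the population standard deviation is assumed for the standardization.

```python
from statistics import mean, pstdev

def min_max_normalize(values):
    """Rescale values linearly so the smallest becomes 0 and the largest 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift and scale values to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical heights in cm, used only for illustration.
heights = [150.0, 160.0, 170.0, 180.0]

normalized = min_max_normalize(heights)
standardized = standardize(heights)
```

Note that `min_max_normalize` divides by zero if all values are identical, so a constant column would need special handling.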

Back to your question:

For your gender column, the values are already 0 and 1, so the column is already "normalized". The real question is whether you should standardize it, and the answer is: yes, you could, but it doesn't really make sense, since the coefficient of a 0/1 variable can be read directly as the difference between the two groups. This question was already discussed here: Should you ever standardise binary variables?
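In practice this could look like the sketch below: scale only the continuous columns and leave the binary one untouched. The sample rows are invented, and the encoding step is only an assumption about how the 0/1 conversion might have been done in pandas.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "gender": ["Male", "Female", "Female", "Male"],
    "weight": [80.0, 60.0, 55.0, 90.0],
    "height": [180.0, 165.0, 160.0, 185.0],
})

# Encode Male as 1 and Female as 0, stored as int8 (one possible way to do it).
df["gender"] = (df["gender"] == "Male").astype("int8")

# Min-max normalize only the continuous columns; gender is left as is.
numeric_cols = ["weight", "height"]
col_min = df[numeric_cols].min()
col_max = df[numeric_cols].max()
df[numeric_cols] = (df[numeric_cols] - col_min) / (col_max - col_min)
```

After this, `weight` and `height` lie in [0, 1] and `gender` keeps its 0/1 values, so all three columns share the same range.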

Upvotes: 9
