Reputation: 65
Very new to R.
I am trying to normalize multiple variables in a matrix, except the last column, which holds a categorical factor variable (in this case good/notgood).
Is there any way to normalize the data without affecting the categorical column? I have tried normalizing with the categorical column left out, but I can't work out how to add it back afterwards.
minimum <- apply(mywines[,-12],2,min)
maximum <- apply(mywines[,-12],2,max)
mywinesNorm <- scale(mywines[,-12],center=minimum,scale=(maximum-minimum))
I still need the 12th column to build supervised models.
Upvotes: 0
Views: 547
Reputation: 37661
The short version is that you can simply reattach the column using cbind. However, it is just a little more complicated than that: scale returns a matrix, not a data frame. In order to mix numbers and factors, you need a data.frame, not a matrix. So before the cbind, you will want to convert the scaled matrix back to a data.frame.
mywinesNorm = cbind(as.data.frame(mywinesNorm), mywines[ ,12])
A different approach would be to just change the data in place:
mywines[ ,-12] = scale(mywines[ ,-12], center=minimum, scale=(maximum-minimum))
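To check that both approaches leave the factor column alone, here is a minimal, self-contained sketch on a toy data frame. The names toy, acidity, alcohol and quality are made up for illustration and are not from the original mywines data; the sketch also assumes the categorical column is the last one.

```r
# Toy stand-in for mywines: two numeric columns plus a factor in the last column.
# All names here are hypothetical, chosen only for this example.
toy <- data.frame(
  acidity = c(7.0, 6.3, 8.1, 7.2),
  alcohol = c(8.8, 9.5, 10.1, 9.9),
  quality = factor(c("good", "notgood", "good", "notgood"))
)

last <- ncol(toy)                     # position of the categorical column
lo   <- apply(toy[, -last], 2, min)   # per-column minimum
hi   <- apply(toy[, -last], 2, max)   # per-column maximum

# Approach 1: scale() returns a matrix, so convert it to a data frame
# before reattaching the factor. drop = FALSE keeps the column name.
scaled  <- scale(toy[, -last], center = lo, scale = hi - lo)
toyNorm <- cbind(as.data.frame(scaled), toy[, last, drop = FALSE])

# Approach 2: overwrite only the numeric columns in place.
toyInPlace <- toy
toyInPlace[, -last] <- scale(toy[, -last], center = lo, scale = hi - lo)

str(toyNorm)      # numeric columns rescaled, quality still a factor
str(toyInPlace)
```

Either result can be passed to a modelling function that expects a data frame with a factor response. The in-place version keeps the original column order and names without any extra work.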
Upvotes: 1