Reputation: 335
I am working with a data frame that has mixed data types (numeric and character) and also has a character key as the primary identifier. I'd like to scale and center the numeric variables. I've tried using the scale() function, but it requires all fields to be numeric. When I take just the numeric fields and scale them, I have to drop the character identifier to be able to scale them.
My ideal end state is that I have a data frame with character fields and scaled numeric fields.
I realize this is a newbie question, so please be gentle ;-)
Thanks!
Jim
Upvotes: 21
Views: 28282
Reputation: 79
Essentially the same approach as Marius proposed, except that mutate_if has been superseded by across:
library(dplyr)
iris %>%
mutate(across(where(is.numeric), scale))
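One caveat: scale() returns a one-column matrix, so the columns produced above end up as matrices rather than plain vectors. A minimal sketch, assuming you want ordinary numeric columns back, wraps the call in as.numeric() (the name `scaled` is just for illustration):

```r
library(dplyr)

# scale() returns a one-column matrix; as.numeric() flattens it to a plain vector
scaled <- iris %>%
  mutate(across(where(is.numeric), ~ as.numeric(scale(.x))))

# Species is left untouched; the numeric columns are centered and scaled
str(scaled)
```

The non-numeric column (Species) passes through unchanged, which is the end state the question asks for.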
Upvotes: 2
Reputation: 60060
This can be done straightforwardly using dplyr::mutate_if:
library(dplyr)
iris %>%
mutate_if(is.numeric, scale)
Upvotes: 29
Reputation: 40899
The code below does not need any external library:
# Scale all numeric columns in a data frame.
# df is your data frame.
performScaling <- TRUE  # Turn it on/off for experimentation.
if (performScaling) {
  # Loop over each column.
  for (colName in names(df)) {
    # is.numeric() is TRUE for both integer and double columns.
    if (is.numeric(df[[colName]])) {
      # Scale this column (scale() centers it and divides by the standard deviation).
      # as.numeric() drops the one-column matrix that scale() returns.
      df[[colName]] <- as.numeric(scale(df[[colName]]))
    }
  }
}
Upvotes: 0
Reputation: 6365
Something like this should do what you want:
library(MASS)
ind <- sapply(anorexia, is.numeric)
anorexia[ind] <- lapply(anorexia[ind], scale)
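To see how this works with a character key, here is a minimal sketch of the same pattern on a small made-up data frame (the data and the names `df`, `id`, `height`, `weight`, and `num_cols` are hypothetical). It also wraps scale() in as.numeric(), on the assumption that you want plain numeric vectors rather than the one-column matrices scale() returns:

```r
# Hypothetical data: a character key plus two numeric columns
df <- data.frame(
  id     = c("a", "b", "c", "d"),
  height = c(150, 160, 170, 180),
  weight = c(50, 60, 65, 85),
  stringsAsFactors = FALSE
)

# Logical vector marking which columns are numeric
num_cols <- sapply(df, is.numeric)

# Scale only those columns; as.numeric() drops the matrix attributes
df[num_cols] <- lapply(df[num_cols], function(x) as.numeric(scale(x)))

df$id                    # the key column is untouched
colMeans(df[num_cols])   # approximately 0 for each scaled column
```

Because only the columns selected by `num_cols` are reassigned, the key stays in the same data frame and no join back is needed.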
Upvotes: 27