Jim Murphy

Reputation: 335

R - Scaling numeric values only in a dataframe with mixed types

I am working with a data frame that has mixed data types (numeric and character) and also has a character key as the primary identifier. I'd like to scale and center the numeric variables. I've tried using the scale() function, but it requires all fields to be numeric. When I take just the numeric fields and scale them, I have to drop the character identifier to be able to scale them.

My ideal end state is that I have a data frame with character fields and scaled numeric fields.

I realize this is a newbie question, so please be gentle ;-)

Thanks!

Jim

Upvotes: 21

Views: 28282

Answers (4)

Denis Kazakov

Reputation: 79

Essentially the same approach as the one proposed by Marius, except that mutate_if has been superseded by across:

library(dplyr)

iris %>%
    mutate(across(where(is.numeric), scale))
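One caveat, offered as a hedged sketch rather than a required change: scale() returns a one-column matrix, so the columns produced above are matrix columns that carry "scaled:center" and "scaled:scale" attributes. If plain numeric columns are preferred, wrapping the call in as.numeric() is one option. The name scaled_iris is only illustrative.

```r
library(dplyr)

# scale() returns a one-column matrix; as.numeric() drops the matrix
# structure and its attributes, leaving a plain numeric vector.
scaled_iris <- iris %>%
    mutate(across(where(is.numeric), function(x) as.numeric(scale(x))))

# The non-numeric column (Species) is left untouched.
str(scaled_iris)
```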

Upvotes: 2

Marius

Reputation: 60060

This can be done straightforwardly using dplyr::mutate_if:

library(dplyr)

iris %>%
    mutate_if(is.numeric, scale)

Upvotes: 29

stackoverflowuser2010

Reputation: 40899

The code below does not need any external library:

# Scale all numeric columns in a data frame.
# df is your data frame

performScaling <- TRUE  # Turn it on/off for experimentation.

if (performScaling) {

    # Loop over each column.
    for (colName in names(df)) {

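        # Check if the column contains numeric data (integer or double).
        # is.numeric() returns a single TRUE/FALSE, unlike comparing class(),
        # which can return several values and break the if() condition.
        if (is.numeric(df[[colName]])) {

            # Scale this column (scale() function applies z-scaling).
            df[[colName]] <- scale(df[[colName]])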
        }
    }
}

Upvotes: 0

James King

Reputation: 6365

Something like this should do what you want:

library(MASS)
ind <- sapply(anorexia, is.numeric)
anorexia[ind] <- lapply(anorexia[ind], scale)
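The same pattern can be sketched on a data frame shaped like the one in the question, with a character key alongside numeric fields. The data frame df and its column names are hypothetical, made up for illustration; as.numeric() is an optional addition that keeps each scaled column as a plain vector instead of the one-column matrix that scale() returns.

```r
# Hypothetical example data: a character key plus mixed column types.
df <- data.frame(
    id     = c("a", "b", "c", "d"),
    group  = c("x", "y", "x", "y"),
    height = c(150, 160, 170, 180),
    weight = c(50, 60, 70, 80),
    stringsAsFactors = FALSE
)

# Flag the numeric columns, then centre and scale only those.
num_cols <- vapply(df, is.numeric, logical(1))
df[num_cols] <- lapply(df[num_cols], function(x) as.numeric(scale(x)))

# The character columns (id, group) are unchanged, so the key survives.
print(df)
```

Because the assignment targets only df[num_cols], the identifier never has to be dropped and re-attached.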

Upvotes: 27
