set.seed(123)
dat <-
data.frame(year_ref = 2000:2004,
www_val1 = sample(5),
www_val2 = sample(5),
www_val3 = sample(5),
sat_val1 = sample(5),
sat_val2 = sample(5),
sat_val3 = sample(5),
ds_val1 = sample(5),
ds_val2 = sample(5),
ds_val3 = sample(5))
I want to scale all columns whose names are given in another vector. For example, if the vector var_names contains ds and sat, I want to scale every column whose name starts with either of them.
var_names <- c("ds", "sat")
library(dplyr)
dat %>%
dplyr::select(contains(var_names)) %>%
dplyr::mutate(scale(., center = T, scale = T))
However, this creates new columns. Can I implement a solution like the one below, so that the changes are made in the original data frame, but without hard-coding the column indices?
dat[, 5:10] <- apply(dat[, 5:10], 2, function(x) scale(x, center = T, scale = T))
Upvotes: 2
Reputation: 4636
library(tidyverse)
set.seed(123)
dat <-
data.frame(year_ref = 2000:2004,
www_val1 = sample(5),
www_val2 = sample(5),
www_val3 = sample(5),
sat_val1 = sample(5),
sat_val2 = sample(5),
sat_val3 = sample(5),
ds_val1 = sample(5),
ds_val2 = sample(5),
ds_val3 = sample(5))
var_names <- c("ds", "sat")
dat %>%
  dplyr::mutate_at(vars(starts_with(var_names)), ~scale(., center = TRUE, scale = TRUE))
# year_ref www_val1 www_val2 www_val3 sat_val1 sat_val2 sat_val3 ds_val1 ds_val2 ds_val3
# 1 2000 3 3 1 0.0000000 -0.6324555 -1.2649111 0.6324555 0.6324555 0.0000000
# 2 2001 5 5 3 -1.2649111 0.0000000 0.0000000 -0.6324555 -1.2649111 1.2649111
# 3 2002 2 2 2 0.6324555 0.6324555 0.6324555 0.0000000 1.2649111 -0.6324555
# 4 2003 4 4 5 -0.6324555 -1.2649111 -0.6324555 1.2649111 -0.6324555 -1.2649111
# 5 2004 1 1 4 1.2649111 1.2649111 1.2649111 -1.2649111 0.0000000 0.6324555
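For the in-place, base R form asked about in the question, the hard-coded `5:10` can be replaced by a name lookup. The following is only a sketch: the helper names `pattern` and `cols` are made up for illustration, and it assumes the prefixes in `var_names` contain no regular-expression metacharacters.

```r
# Rebuild the example data from the question
set.seed(123)
dat <- data.frame(year_ref = 2000:2004,
                  www_val1 = sample(5), www_val2 = sample(5), www_val3 = sample(5),
                  sat_val1 = sample(5), sat_val2 = sample(5), sat_val3 = sample(5),
                  ds_val1 = sample(5), ds_val2 = sample(5), ds_val3 = sample(5))

var_names <- c("ds", "sat")

# Build one regular expression that matches any prefix at the start of a name,
# here "^(ds|sat)" (assumes the prefixes need no regex escaping)
pattern <- paste0("^(", paste(var_names, collapse = "|"), ")")
cols <- grep(pattern, names(dat), value = TRUE)

# Overwrite only the matched columns in the original data frame.
# scale() returns a one-column matrix, so as.numeric() turns it back into a vector.
dat[cols] <- lapply(dat[cols], function(x) as.numeric(scale(x, center = TRUE, scale = TRUE)))
```

Because the assignment targets `dat[cols]`, the remaining columns (`year_ref` and the `www_*` columns) are left as they were.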
Upvotes: 1
Reputation: 9247
In dplyr version >= 1.0.0 you can use the function across to apply a function to all the columns that satisfy a certain condition:
library(dplyr)
dat %>%
  mutate(across(starts_with(var_names), scale))
# year_ref www_val1 www_val2 www_val3 sat_val1 sat_val2 sat_val3 ds_val1 ds_val2 ds_val3
# 1 2000 3 3 2 -1.2649111 0.0000000 -1.2649111 -1.2649111 -0.6324555 -0.6324555
# 2 2001 2 1 3 0.6324555 0.6324555 -0.6324555 0.0000000 -1.2649111 0.0000000
# 3 2002 5 2 1 1.2649111 -0.6324555 0.0000000 0.6324555 0.0000000 0.6324555
# 4 2003 4 5 4 0.0000000 -1.2649111 0.6324555 1.2649111 0.6324555 1.2649111
# 5 2004 1 4 5 -0.6324555 1.2649111 1.2649111 -0.6324555 1.2649111 -1.2649111
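Two details may be worth checking when using this: the pipeline only returns a modified copy, so the result has to be assigned back to `dat` to change the original, and `scale()` returns a one-column matrix, which `mutate()` stores as a matrix column. A possible variant, assuming plain numeric columns are wanted, wraps the call in `as.numeric()`:

```r
library(dplyr)

# Rebuild the example data from the question
set.seed(123)
dat <- data.frame(year_ref = 2000:2004,
                  www_val1 = sample(5), www_val2 = sample(5), www_val3 = sample(5),
                  sat_val1 = sample(5), sat_val2 = sample(5), sat_val3 = sample(5),
                  ds_val1 = sample(5), ds_val2 = sample(5), ds_val3 = sample(5))

var_names <- c("ds", "sat")

# Assign the result back so that dat itself is updated;
# as.numeric() drops the matrix structure and attributes added by scale()
dat <- dat %>%
  mutate(across(starts_with(var_names), ~ as.numeric(scale(.x))))

str(dat)  # the sat_* and ds_* columns are plain numeric vectors
```

The columns keep their original names and positions, since `across()` overwrites the selected columns unless a `.names` argument is given.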
Upvotes: 1