R dplyr using all columns again after select

Question

I'm looking for a way to perform an operation on a selection of columns and then continue working with all columns again. I would also like to keep the original column order. The untransformed versions of the selected columns are not needed afterwards.

My data has row.names, if that helps.

library(dplyr)
data(iris)
iris2 <- cbind(first = iris, second = iris)

iris2 %>%
  select(-contains("Species")) %>%
  scale() #%>%
  #deselect() ??

Any ideas on this one? I couldn't find a function like unselect or deselect but I guess I'm missing something obvious?

hendrikvanb · Accepted Answer

This looks like a perfect use case for either dplyr::mutate_at or dplyr::mutate_if. In both examples below, the scale() function is applied only to the desired columns, the column order is preserved, and all columns are retained.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

data(iris)

# use mutate_at to target every column except 'Species' and apply the
# scale function to those columns
a <- iris %>% 
  mutate_at(vars(-matches('Species')), scale)

head(a)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1   -0.8976739  1.01560199    -1.335752   -1.311052  setosa
#> 2   -1.1392005 -0.13153881    -1.335752   -1.311052  setosa
#> 3   -1.3807271  0.32731751    -1.392399   -1.311052  setosa
#> 4   -1.5014904  0.09788935    -1.279104   -1.311052  setosa
#> 5   -1.0184372  1.24503015    -1.335752   -1.311052  setosa
#> 6   -0.5353840  1.93331463    -1.165809   -1.048667  setosa


# alternatively, use mutate_if to target all numeric columns and apply the
# scale function to those columns
b <- iris %>% 
  mutate_if(is.numeric, scale)

identical(a, b)
#> [1] TRUE
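
As of dplyr 1.0.0, mutate_at and mutate_if are superseded by across(), so the same idea can also be written as in the sketch below. It assumes dplyr >= 1.0.0 is installed. The names c1 and scaled2 are only illustrative, and wrapping scale() in as.numeric() is my own choice: scale() returns a one-column matrix, and the wrapper turns each result back into a plain numeric vector. The second part applies the same approach to the iris2 data from the question.

```r
library(dplyr)

# Assumption: dplyr >= 1.0.0, where across() is available.
data(iris)

# Same selection as mutate_if(is.numeric, scale), written with across().
# as.numeric() drops the matrix structure that scale() returns.
c1 <- iris %>%
  mutate(across(where(is.numeric), ~ as.numeric(scale(.x))))

# The data from the question: two copies of iris side by side, with
# prefixed names such as first.Sepal.Length and second.Species.
iris2 <- cbind(first = iris, second = iris)

# Scale every column whose name does not contain "Species".
scaled2 <- iris2 %>%
  mutate(across(-contains("Species"), ~ as.numeric(scale(.x))))

# Check that the column order and the Species columns are unchanged.
identical(names(scaled2), names(iris2))
identical(scaled2$first.Species, iris2$first.Species)
```

If the matrix columns produced by a bare scale call are acceptable, passing scale directly to across() should behave like the mutate_at and mutate_if versions above.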
