Reputation: 143
I recently moved from base R data frame manipulation to the tidyverse, but I have run into a problem scaling columns with the scale() function.
My data consists of columns, some of which are numerical and some of which are categorical features. The last column is the y value of the data. I want to scale all numerical columns except that last one.
With the select() function I can write a very short line of code that selects all the numerical columns that need to be scaled by adding the ends_with("...") argument. But I can't make use of that when scaling: there I have to use transmute(feature1=scale(feature1),feature2=scale(feature2)...) and name each feature individually. This works fine but bloats the code.
So my question is:
Is there a smart solution to manipulate column by column without the need to address every single column name with transmute?
I imagine something like:
transmute(ends_with("...")=scale(ends_with("..."),featureX,featureZ)
(well aware that this does not work)
Many thanks in advance
Upvotes: 1
Views: 1533
Reputation: 26343
library(tidyverse)
data("economics")
# set the seed first so that both random draws below are reproducible
set.seed(1)
# add variables that are not numeric
economics[7:9] <- sample(LETTERS[1:10], size = dim(economics)[1], replace = TRUE)
# add a 'y' column (for illustration)
economics$y <- rnorm(n = dim(economics)[1])
# drop y, scale the remaining numeric columns, then add the unscaled y back
economics_modified <- economics %>%
  select(-y) %>%
  transmute_if(is.numeric, scale) %>%
  add_column(y = economics$y)
If you want to keep the columns that are not numeric, replace transmute_if with modify_if. (There might be a smarter way to exclude column y from being scaled.)
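One possible way to exclude y without dropping it and adding it back is across(), assuming dplyr 1.0.0 or later is available. The sketch below is not tested against the economics data; it uses a small hypothetical data frame (the names df, feature_a, feature_b and group are made up for illustration) and combines where(is.numeric) with !y in a single selection.

```r
library(dplyr)

# Hypothetical toy data: two numeric features, one categorical column, and a target y
df <- data.frame(
  feature_a = c(1, 2, 3, 4, 5),
  feature_b = c(10, 20, 30, 40, 50),
  group     = c("a", "b", "a", "b", "a"),
  y         = c(0.5, 1.5, 2.5, 3.5, 4.5)
)

# Scale every numeric column except y; non-numeric columns are kept untouched.
# as.numeric() drops the one-column matrix structure that scale() returns.
df_scaled <- df %>%
  mutate(across(where(is.numeric) & !y, ~ as.numeric(scale(.x))))
```

Any other selection helper, such as ends_with() with your own suffix, should be usable in place of where(is.numeric), so the selection you already use with select() can be carried over.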
Upvotes: 2