Reputation: 879
I often need to recode several columns that follow the same structure and save the results in columns with different names. If I could overwrite the originals, this would be a single line in dplyr, but since I also want to keep the original columns, I don't know a good solution. Below is an illustration.
This is the long code whose output I would like to replicate:
library(dplyr)
library(ggplot2)
data("diamonds")
diamonds <- diamonds %>%
  mutate(x_char = case_when(x <= 4.5 ~ "low",
                            x > 4.5 & x < 7 ~ "so-so",
                            x >= 7 ~ "large",
                            TRUE ~ as.character(NA)),
         y_char = case_when(y <= 4.5 ~ "low",
                            y > 4.5 & y < 7 ~ "so-so",
                            y >= 7 ~ "large",
                            TRUE ~ as.character(NA)),
         z_char = case_when(z <= 4.5 ~ "low",
                            z > 4.5 & z < 7 ~ "so-so",
                            z >= 7 ~ "large",
                            TRUE ~ as.character(NA)))
This is the short code with mutate_at, which overwrites the original columns:
library(dplyr)
library(ggplot2)
data("diamonds")
diamonds <- diamonds %>%
  mutate_at(vars(x, y, z), ~ case_when(. <= 4.5 ~ "low",
                                       . > 4.5 & . < 7 ~ "so-so",
                                       . >= 7 ~ "large",
                                       TRUE ~ as.character(NA)))
Is there a way to keep the short code with mutate_at but change it so that the original columns are kept and the new ones are saved under different names? In the example, that would mean appending _char to each original column name and applying the recode from the embedded formula.
Upvotes: 3
Views: 905
Reputation: 8880
Try using across() with its .names argument (available since dplyr 1.0.0):
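library(tidyverse)
diamonds %>%
  mutate(
    across(.cols = c(x, y, z),
           # .x is the current column; use it in every condition
           .fns = ~case_when(.x <= 4.5 ~ "low",
                             .x > 4.5 & .x < 7 ~ "so-so",
                             .x >= 7 ~ "large",
                             TRUE ~ as.character(NA)),
           # keep the originals and name the new columns x_char, y_char, z_char
           .names = "{.col}_char")
  )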
#> # A tibble: 53,940 x 13
#> carat cut color clarity depth table price x y z x_char y_char
#> <dbl> <ord> <ord> <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <chr> <chr>
#> 1 0.23 Ideal E SI2 61.5 55 326 3.95 3.98 2.43 low low
#> 2 0.21 Premium E SI1 59.8 61 326 3.89 3.84 2.31 low low
#> 3 0.23 Good E VS1 56.9 65 327 4.05 4.07 2.31 low low
#> 4 0.290 Premium I VS2 62.4 58 334 4.2 4.23 2.63 low low
#> 5 0.31 Good J SI2 63.3 58 335 4.34 4.35 2.75 low low
#> 6 0.24 Very G~ J VVS2 62.8 57 336 3.94 3.96 2.48 low low
#> 7 0.24 Very G~ I VVS1 62.3 57 336 3.95 3.98 2.47 low low
#> 8 0.26 Very G~ H SI1 61.9 55 337 4.07 4.11 2.53 low low
#> 9 0.22 Fair E VS2 65.1 61 337 3.87 3.78 2.49 low low
#> 10 0.23 Very G~ H VS1 59.4 61 338 4 4.05 2.39 low low
#> # ... with 53,930 more rows, and 1 more variable: z_char <chr>
Created on 2021-03-09 by the reprex package (v1.0.0)
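As a small sketch of the same idea, the recode can be pulled out into a named function and checked on a toy data frame. The helper name size_label and the data frame df are made up for illustration and are not part of the question. The second variant assumes that passing a named list to mutate_at (dplyr 0.8 or later) appends the list name as a suffix, which may be handy if across() is not available.

```r
library(dplyr)

# Hypothetical helper (the name is an assumption): same cut points as in the question.
size_label <- function(v) {
  case_when(
    v <= 4.5        ~ "low",
    v > 4.5 & v < 7 ~ "so-so",
    v >= 7          ~ "large",
    TRUE            ~ NA_character_  # missing input stays missing
  )
}

# Small made-up data frame standing in for diamonds.
df <- tibble(x = c(4, 5, 8), y = c(7, 4.5, 6), z = c(NA, 9, 1))

# across() with .names keeps x, y, z and adds x_char, y_char, z_char.
res_across <- df %>%
  mutate(across(c(x, y, z), size_label, .names = "{.col}_char"))

# Older style: a named list in mutate_at() appends "_char" to each column name.
res_at <- df %>%
  mutate_at(vars(x, y, z), list(char = size_label))
```

Both results should keep the original columns next to the recoded ones. Note that mutate_at is superseded in current dplyr, so across() is probably the better choice for new code.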
Upvotes: 8