Reputation: 1807
I think this should be a pretty easy problem, but I can't seem to figure out the answer. Essentially I want to add a prefix to the values of a column, but without using any type of conditional statement. Is this possible?
i.e., I want something that produces the same result as
mtcars %>%
mutate(new_cyl = str_c(case_when(cyl == 6 ~ letters[1],
cyl == 4 ~ letters[2],
cyl == 8 ~ letters[3]), ") ", cyl))
But without using the case_when
or any other conditional.
My expected output would append something (in this case letters) to the values of a column in the order they appear. Here it looks like:
Upvotes: 3
Views: 327
Reputation: 4520
You can use factor()
for the labeling:
mtcars %>%
mutate(
new_cyl = paste(factor(x, unique(x), letters[seq(unique(x))]), cyl, sep = ') ')
)
Output (head):
# mpg cyl disp hp drat wt qsec vs am gear carb new_cyl
# 1 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 a) 6
# 2 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 a) 6
# 3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 b) 4
# 4 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 a) 6
# 5 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 c) 8
# 6 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 a) 6
Generalized as a function:
append_prefix <- function(x, sort = F){
out <- paste(factor(x, unique(x), letters[seq(unique(x))]), x, sep = ') ')
if(sort) out <- paste(factor(x, sort(unique(x)), letters[seq(unique(x))]), x, sep = ') ')
return(out)
}
Upvotes: 1
Reputation: 16832
You can make a small lookup table of prefixes. That way, you know you're matching up cyl values to their prefixes correctly, rather than depending on the order of the data being correct. Then join the lookup to your data and paste.
library(dplyr)
lookup <- data.frame(
cyl = c(6, 4, 8),
prefix = letters[1:3]
)
For the case where you want to create the prefixes based on the order of the cyl
column, rather than hardcoding cyl
values to their prefixes, you could create the lookup table more dynamically, such as:
lookup <- data.frame(
cyl = unique(mtcars$cyl),
prefix = letters[seq_along(unique(mtcars$cyl))]
)
# select is just to get extra columns out of the way to show here
mtcars %>%
select(cyl) %>%
left_join(lookup, by = "cyl") %>%
mutate(new_cyl = paste(prefix, cyl, sep = ") ")) %>%
head()
#> cyl prefix new_cyl
#> 1 6 a a) 6
#> 2 6 a a) 6
#> 3 4 b b) 4
#> 4 6 a a) 6
#> 5 8 c c) 8
#> 6 6 a a) 6
Upvotes: 4
Reputation: 39858
If you want to assign a letter to all "cyl" values:
mtcars %>%
mutate(new_cyl = paste0(letters[cyl], ") ", cyl))
mpg cyl disp hp drat wt qsec vs am gear carb new_cyl
1 21.0 6 160.0 110 3.90 2.620 16.46 0 1 4 4 f) 6
2 21.0 6 160.0 110 3.90 2.875 17.02 0 1 4 4 f) 6
3 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1 d) 4
4 21.4 6 258.0 110 3.08 3.215 19.44 1 0 3 1 f) 6
5 18.7 8 360.0 175 3.15 3.440 17.02 0 0 3 2 h) 8
6 18.1 6 225.0 105 2.76 3.460 20.22 1 0 3 1 f) 6
7 14.3 8 360.0 245 3.21 3.570 15.84 0 0 3 4 h) 8
8 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2 d) 4
9 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2 d) 4
10 19.2 6 167.6 123 3.92 3.440 18.30 1 0 4 4 f) 6
To assign letters based on actual order of "cyl" values:
mtcars %>%
mutate(dup = cumsum(!duplicated(cyl))) %>%
group_by(cyl) %>%
mutate(dup = first(dup),
new_cyl = paste0(letters[dup], ") ", cyl)) %>%
ungroup() %>%
select(-dup)
mpg cyl disp hp drat wt qsec vs am gear carb new_cyl
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
1 21 6 160 110 3.9 2.62 16.5 0 1 4 4 a) 6
2 21 6 160 110 3.9 2.88 17.0 0 1 4 4 a) 6
3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1 b) 4
4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1 a) 6
5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2 c) 8
6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1 a) 6
7 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4 c) 8
8 24.4 4 147. 62 3.69 3.19 20 1 0 4 2 b) 4
9 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2 b) 4
10 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4 a) 6
Here it, first, creates a cumulative sum of non-duplicate "cyl" values. Second, it groups by "cyl". Finally, it takes the first value of cumulative sum of non-duplicate values and assigns it a letter.
Upvotes: 2