Mevve

Reputation: 159

Change multiple values in column in a tidyverse fashion

For illustration, let's use the built-in mpg data.

> mpg %>% select(model) %>% unique()
#   model             
#   <chr>             
# 1 a4                
# 2 a4 quattro        
# 3 a6 quattro  
# ...

I want to change all "a4 quattro" values to "a4" and all "a6 quattro" values to "a6". I know about gsub:

> mpg <- mpg %>% mutate(model = gsub("a4 quattro", "a4", model))
> mpg <- mpg %>% mutate(model = gsub("a6 quattro", "a6", model))

But is there a way for me to do this in one line?

Furthermore, is there a way to generalize this? Say I have a nested list-type object with this structure:

> a
# $a4
# [1] "a4 quattro" "a4 model 2" "model 3"   
#
# $a6
# [1] "a6 quattro" "model k" 

Is there an easy way to change every value in mpg (our data) that matches an element of a$a4 into the name of that sublist, "a4", and to do the same for a$a6 (and potentially more list elements in a)? Alternatively, is there a "better" data structure to use for this?

I want this to be done in a "tidyverse" fashion. purrr functionality is OK, but for loops are not.

Thanks in advance.

Upvotes: 0

Views: 65

Answers (1)

Ronak Shah

Reputation: 388807

You could use recode to do an exact match and replace one value with another.

library(tidyverse)
mpg %>% 
   mutate(model = recode(model, 'a4 quattro' = 'a4', 'a6 quattro' = 'a6'))
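If the replacements already live in a named character vector, recode can take them all at once by splicing the vector with !!!. A minimal sketch, assuming a hypothetical vector called replacements (old value as the name, new value as the element). Note that recode is superseded in dplyr 1.1.0 and later in favor of case_match, though it still works:

```r
library(dplyr)
library(ggplot2)  # provides the mpg data set

# Hypothetical lookup vector: names are the old values, elements the new ones
replacements <- c("a4 quattro" = "a4", "a6 quattro" = "a6")

# !!! splices the named vector into recode() as old = new arguments;
# values without a match are left unchanged
mpg_recoded <- mpg %>%
  mutate(model = recode(model, !!!replacements))
```

This keeps the mapping as data, so adding another pair only means extending the vector.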

If the values follow a pattern, as they do here, you could use a regex to get the desired output.

mpg %>% 
  mutate(model = sub(' quattro', '', model))
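A slightly stricter variant of the same idea anchors the pattern to the end of the string, so that " quattro" is only removed when it is a suffix. This is a sketch using stringr::str_remove on a small hypothetical vector rather than on mpg itself:

```r
library(stringr)

# Hypothetical sample values, including one that should stay untouched
models <- c("a4 quattro", "a6 quattro", "a4", "quattro sport")

# "$" anchors the match to the end of the string
cleaned <- str_remove(models, " quattro$")
```

Inside a pipeline the same call would go in mutate(model = str_remove(model, " quattro$")).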

For a limited number of values you can use case_when:

mpg %>%
  mutate(model = case_when(model %in% c("a4 quattro", "a4 model 2", "model 3") ~ 'a4', 
                           model %in% c("a6 quattro", "model k") ~'a6', 
                           TRUE ~ model))

As a more general solution, if you already have a list, you could convert it into a data frame and join it with the original data.

a <- list(a4 = c("a4 quattro", "a4 model 2", "model 3"), 
          a6 = c("a6 quattro", "model k"))

enframe(a) %>%
  unnest(value) %>%
  inner_join(mpg, by = c('value' = 'model'))
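The inner join above returns only the rows of mpg whose model appears in the list, with the replacement in a separate name column. If the goal is to keep every row and overwrite model in place, one possible variation is a left join followed by coalesce. This is a sketch, not the only way to do it; the names lookup, new_model and mpg_recoded are made up for illustration:

```r
library(dplyr)
library(tidyr)
library(tibble)
library(ggplot2)  # provides the mpg data set

a <- list(a4 = c("a4 quattro", "a4 model 2", "model 3"),
          a6 = c("a6 quattro", "model k"))

# One row per (replacement, old value) pair
lookup <- enframe(a, name = "new_model", value = "model") %>%
  unnest(model)

mpg_recoded <- mpg %>%
  left_join(lookup, by = "model") %>%            # new_model is NA where nothing matches
  mutate(model = coalesce(new_model, model)) %>% # fall back to the original value
  select(-new_model)
```

This assumes each old value appears in at most one sublist of a; a value listed twice would duplicate rows in the join.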

Upvotes: 2
