Mark

Reputation: 2889

Replace entire string anywhere in dataframe based on partial match with dplyr

I'm struggling to find the right dplyr code to use grepl or an equivalent to replace values throughout an entire data frame.

i.e., any cell that contains 'Mazda' should have its entire content replaced with the new string 'A car'.

After lots of searching online, the closest I came is the code below; the emphasis is on applying it to ALL columns.

library(dplyr)
mtcars$carnames <- rownames(mtcars)  # dummy data to test on

This line does the trick when the entire string is an exact match:

mtcars %>% replace(., (.)=='Mazda RX4', "A car")

but my grepl attempt replaces the entire column with "A car" for some reason.

mtcars %>% replace(., grepl('Mazda', (.)), "A car")

Upvotes: 4

Views: 3696

Answers (1)

A. Suliman

Reputation: 13125

library(dplyr)
mtcars %>% mutate_if(grepl('Mazda',.), ~replace(., grepl('Mazda', .), "A car"))
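In dplyr 1.0.0 and later, mutate_if is superseded by across(). Here is a minimal sketch of a roughly equivalent call, assuming only character columns need checking (the numeric columns cannot contain 'Mazda'). The names cars and result are illustrative, not from the question.

```r
library(dplyr)

cars <- mtcars
cars$carnames <- rownames(cars)  # same dummy column as in the question

# Apply the replacement to every character column; numeric columns are untouched.
result <- cars %>%
  mutate(across(where(is.character),
                ~ replace(.x, grepl("Mazda", .x, fixed = TRUE), "A car")))

sum(result$carnames == "A car")  # 2 (Mazda RX4 and Mazda RX4 Wag)
```

Restricting to where(is.character) also keeps the numeric columns numeric, which may or may not be what you want if other column types can hold the pattern.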

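To understand why your grepl attempt failed, compare the output of 'Mazda RX4'==mtcars with that of grepl('Mazda', mtcars): the first is a logical matrix with one value per cell, while the second is a logical vector with one value per column. Since you used grepl, replace treats that vector as a column index. From the documentation of replace: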

replace replaces the values in x with indices given in list by those given in values. If necessary, the values in values are recycled.
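The difference in shape between the two indices can be checked directly. This is a small sketch on the question's dummy data; the names df, exact and partial are illustrative.

```r
df <- mtcars
df$carnames <- rownames(df)  # dummy column from the question

exact   <- df == "Mazda RX4"   # logical matrix: one value per cell
partial <- grepl("Mazda", df)  # logical vector: one value per column

dim(exact)          # 32 12
length(partial)     # 12
names(df)[partial]  # "carnames"
```

Because partial flags the carnames column as a whole, replace(df, partial, "A car") overwrites every value in that column.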

Your first method works with grepl too, as long as the index is a logical matrix with one value per cell, which sapply can produce, for example:

mtcars %>% replace(., sapply(mtcars, function(.) grepl('Mazda',.)), "A car")

Update:

To replace multiple patterns, we can use stringr::str_replace_all:

library(stringr)
library(dplyr)
mtcars %>% mutate_if(str_detect(., 'Mazda|Merc'), 
                    ~str_replace_all(., c("Mazda.*" = "A car", "Merc.*" = "B car")))
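To see what the named pattern vector does on its own, here is a small sketch on a hypothetical character vector x (not part of mtcars). It also shows that "Mazda.*" only replaces from the brand name onward, so text before the match survives. The second call uses an assumed variant with a leading ".*" to replace the whole string.

```r
library(stringr)

x <- c("Mazda RX4", "Merc 240D", "Fiat 128", "My Mazda RX4")

# Patterns from the answer: the match starts at the brand name.
from_match <- str_replace_all(x, c("Mazda.*" = "A car", "Merc.*" = "B car"))
from_match
# "A car" "B car" "Fiat 128" "My A car"

# Assumed variant: a leading ".*" makes the match cover the whole string.
whole <- str_replace_all(x, c(".*Mazda.*" = "A car", ".*Merc.*" = "B car"))
whole
# "A car" "B car" "Fiat 128" "A car"
```

For the mtcars row names the two variants give the same result, since every matching name starts with the brand.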

Upvotes: 8
