geds133

Reputation: 1485

Efficient way to replace all column values using a list of strings in R

I have a dataframe called car and am trying to replace certain values in the column called make, which contains a number of unique car makes. I want to replace any value contained in a list of strings called car_list with the value low_frequency. However, I only want whole-string replacements, not partial ones, which I have tried to enforce using word boundaries.

The 'make' column of 'car' looks as such:

 make
 "LANDROVER"
 "VOLKSWAGEN"
 "VAUXHALL"
 "MG-MOTOR UK"
 "ROVER"
 "BWM"

'car_list' looks as such:

"MG-MOTOR UK"
"ROVER"
"ROBIN"

I wish to replace all values in make that are contained in car_list with the new value low_frequency. The output should look as such:

 make
 "LANDROVER"
 "VOLKSWAGEN"
 "VAUXHALL"
 "low_frequency"
 "low_frequency"
 "BWM"

My attempt used gsub and a for loop, as I am used to Python, but I know this is bad practice in R, and the solution crashes:

for (string in car_list){
  car['make'] <- gsub(paste('\\b', string, '\\b', sep = ''), 'low_frequency', car['make'])}

Any help would be appreciated.

Upvotes: 0

Views: 667

Answers (1)

jay.sf

Reputation: 72919

Using replace.

replace(car$make, car$make %in% car_list, "low_frequency")
# [1] "LANDROVER"     "VOLKSWAGEN"    "VAUXHALL"      "low_frequency" "low_frequency" "BWM"

or with [[ ]] indexing:

replace(car[["make"]], car[["make"]] %in% car_list, "low_frequency")
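Note that replace returns a modified copy and leaves car itself untouched, so the result needs to be assigned back to the column if the data frame should change. A minimal sketch, reusing the data from this answer; writing car_list as a character vector with c() is my assumption here, and the list() version behaves the same with %in%:

```r
# Sample data, same values as in the question
car <- data.frame(
  make = c("LANDROVER", "VOLKSWAGEN", "VAUXHALL", "MG-MOTOR UK", "ROVER", "BWM"),
  stringsAsFactors = FALSE
)
car_list <- c("MG-MOTOR UK", "ROVER", "ROBIN")

# %in% compares whole strings, so "LANDROVER" is not matched by "ROVER"
car$make <- replace(car$make, car$make %in% car_list, "low_frequency")

car$make
```

Because %in% tests for exact equality, no regular expression or word boundary is needed to avoid partial matches.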

Wrapped in a function, the replace call would look something like:

FUN <- function(x) replace(car[[x]], car[[x]] %in% car_list, "low_frequency")
FUN("make")
# [1] "LANDROVER"     "VOLKSWAGEN"    "VAUXHALL"      "low_frequency" "low_frequency" "BWM"    
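The function above reads car and car_list from the global environment. A possible variant passes everything in as arguments and returns the updated data frame; the name replace_in_column and its parameters are hypothetical, chosen only for this sketch:

```r
# Hypothetical helper: replaces every value of df[[column]] found in
# `values` with `replacement` and returns the modified data frame.
replace_in_column <- function(df, column, values, replacement = "low_frequency") {
  df[[column]] <- replace(df[[column]], df[[column]] %in% values, replacement)
  df
}

# Sample data, same values as in the question
car <- data.frame(
  make = c("LANDROVER", "VOLKSWAGEN", "VAUXHALL", "MG-MOTOR UK", "ROVER", "BWM"),
  stringsAsFactors = FALSE
)
car_list <- c("MG-MOTOR UK", "ROVER", "ROBIN")

# The original `car` is left unchanged; the result is a new data frame
car_updated <- replace_in_column(car, "make", car_list)
car_updated$make
```

Returning a new data frame rather than modifying car in place keeps the function free of side effects, which is the usual convention in R.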

  

Data:

car <- structure(list(make = c("LANDROVER", "VOLKSWAGEN", "VAUXHALL", 
"MG-MOTOR UK", "ROVER", "BWM")), class = "data.frame", row.names = c(NA, 
-6L))

car_list <- list("MG-MOTOR UK", "ROVER", "ROBIN")
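As for why the loop in the question misbehaves: car['make'] is a one-column data frame, not a character vector, and gsub coerces it with as.character, which collapses the whole column into a single deparsed string. The sketch below illustrates this and shows a loop that operates on car$make instead; using ^ and $ anchors for whole-string matching is my substitution for the word boundaries in the question, and it assumes the entries of car_list contain no regex metacharacters that change their meaning:

```r
# Sample data, same values as in the answer
car <- data.frame(
  make = c("LANDROVER", "VOLKSWAGEN", "VAUXHALL", "MG-MOTOR UK", "ROVER", "BWM"),
  stringsAsFactors = FALSE
)
car_list <- list("MG-MOTOR UK", "ROVER", "ROBIN")

# Single-bracket indexing keeps the data frame; coercing it to character
# yields one long string instead of six separate values
collapsed <- as.character(car["make"])
length(collapsed)

# Working on the character vector car$make avoids the coercion problem
fixed <- car$make
for (s in car_list) {
  # Anchors force the pattern to match the entire string only
  fixed <- gsub(paste0("^", s, "$"), "low_frequency", fixed)
}
fixed
```

The loop gives the same result as the replace call for this data, but the %in% approach is shorter and does not depend on regular-expression escaping.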

Upvotes: 1
