Makoto Miyazaki

Reputation: 1983

R: Conditional replacement using two data frames

I have two data frames, df and df_rep, like this:

df <- data.frame(fruits = c("apple", "orange", "pineapple", "banana", "grape"))
df_rep <- data.frame(eng = c("apple", "orange", "grape"), 
                     esp = c("manzana", "naranja", "uva"))
>df
   fruits
    apple
   orange
pineapple
   banana
    grape

>df_rep
   eng        esp
 apple    manzana
orange    naranja
 grape        uva

I want to replace the values in the fruits column of df by looking them up in df_rep. If a value in the fruits column of df appears in the eng column of df_rep, I want to replace it with the corresponding value in the esp column of df_rep. The result should look like this:

>df
   fruits
  manzana
  naranja
pineapple
   banana
      uva

Note: I don't want to use ifelse, because my real data has more than 100 replacement pairs; the example here is simplified for easy understanding. I don't want a for loop either, because my data frame has more than 40,000 rows. I am looking for a simple, single-step solution.

Thank you very much for your help!

Upvotes: 1

Views: 339

Answers (2)

markus

Reputation: 26343

Another option is coalesce from dplyr, which replaces the NAs that match produces for unmatched values with the corresponding values from df$fruits.

library(dplyr)
df$fruits2 <- coalesce(df_rep$esp[match(df$fruits, df_rep$eng)], df$fruits)
df
#     fruits   fruits2
#1     apple   manzana
#2    orange   naranja
#3 pineapple pineapple
#4    banana    banana
#5     grape       uva
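
For comparison, the same match-based lookup can be sketched in base R without dplyr, overwriting fruits in place rather than adding a column. This is only a rough sketch: idx and hit are helper names made up for illustration, and it assumes the columns are character vectors, not factors.

```r
df <- data.frame(fruits = c("apple", "orange", "pineapple", "banana", "grape"),
                 stringsAsFactors = FALSE)
df_rep <- data.frame(eng = c("apple", "orange", "grape"),
                     esp = c("manzana", "naranja", "uva"),
                     stringsAsFactors = FALSE)

# Position of each fruit in df_rep$eng; NA where there is no translation
idx <- match(df$fruits, df_rep$eng)

# TRUE for the rows that have a replacement (hypothetical helper name)
hit <- !is.na(idx)

# Overwrite only the matched rows; unmatched values are left untouched
df$fruits[hit] <- df_rep$esp[idx[hit]]
df
```

Both versions are vectorised, so they should scale to a long replacement list and many rows without a loop.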

Upvotes: 1

bouncyball

Reputation: 10761

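We can use the merge function (to simulate a SQL left join) and then the ifelse function to replace fruits with esp wherever esp is not NA. Note that merge sorts the result by the join column by default, so the rows come back in a different order than in df: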

df2 <- merge(df, df_rep, by.x = 'fruits', by.y = 'eng', all.x = TRUE)

df2$fruits <- ifelse(is.na(df2$esp), df2$fruits, df2$esp)
df2
#      fruits     esp
# 1   manzana manzana
# 2    banana    <NA>
# 3       uva     uva
# 4   naranja naranja
# 5 pineapple    <NA>
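
If the original row order matters, one possible workaround is to carry a row index through the merge and sort on it afterwards. This is a sketch only: row_id and result are names introduced here for illustration, and the data is created with stringsAsFactors = FALSE as in the Data section.

```r
df <- data.frame(fruits = c("apple", "orange", "pineapple", "banana", "grape"),
                 stringsAsFactors = FALSE)
df_rep <- data.frame(eng = c("apple", "orange", "grape"),
                     esp = c("manzana", "naranja", "uva"),
                     stringsAsFactors = FALSE)

# Remember the original position of every row (hypothetical helper column)
df$row_id <- seq_len(nrow(df))

# Left join; merge() sorts by the join column, so the order changes here
df2 <- merge(df, df_rep, by.x = "fruits", by.y = "eng", all.x = TRUE)

# Restore the original order, then apply the replacement
df2 <- df2[order(df2$row_id), ]
df2$fruits <- ifelse(is.na(df2$esp), df2$fruits, df2$esp)

# Keep only the fruits column and reset the row names
result <- df2["fruits"]
rownames(result) <- NULL
result
```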

Data

It's important to set stringsAsFactors = FALSE when creating the data (this was not the default before R 4.0.0):

df <- data.frame(fruits = c("apple", "orange", "pineapple", "banana", "grape"),
                 stringsAsFactors = FALSE)
df_rep <- data.frame(eng = c("apple", "orange", "grape"), 
                     esp = c("manzana", "naranja", "uva"),
                     stringsAsFactors = FALSE)
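
To illustrate why this matters, here is a small sketch of what ifelse does with factor inputs. The vectors f_fruits and f_esp are made up for the demonstration and are not part of the data above.

```r
# Two small factor vectors (hypothetical, for demonstration only)
f_fruits <- factor(c("apple", "banana"))
f_esp    <- factor(c("manzana", NA))

# With factors, ifelse() drops the labels and returns the underlying codes
out_factor <- ifelse(is.na(f_esp), f_fruits, f_esp)
is.character(out_factor)  # FALSE

# Converting to character first gives the intended strings
out_char <- ifelse(is.na(f_esp), as.character(f_fruits), as.character(f_esp))
out_char
```

So if the columns are already factors, wrapping them in as.character before the ifelse step is one way around the problem.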

Upvotes: 2
