change value in a column of a table based on a reference

Question

I have a basic question about updating value in certain rows based on a reference. I am sure someone else has asked similar quesions, but I could not find the exact answers. Below is the description of my question. It would be great if someone could suggest a link of answer or provide solutions here. Thank you in advance.

I have a table with more than 10,000 rows, like this:

        Quality       City
a1      pending        NY
a2      pending        LA
a3      pending        LA
a4      pending        SF
a5      pending        SF
a6      pending        NY
a7      pending      Chicago
...
a10000  pending        DC

I would like to update the name in column "Quality" based on a subset of reference, like this:

       Quality        City
a1      bad            NY
a3      good           LA
a5      good           SF
a6      bad            NY
...
a200    good         Chicago

Here is the result I am looking for:

       Quality        City
a1      bad            NY
a2      pending        LA
a3      good           LA
a4      pending        SF
a5      good           SF
a6      bad            NY
a7      pending      Chicago
...
a200    good         Chicago
...
a10000  pending        DC

Onyambu · Accepted Answer

We could do a full_join, then coalesce the quality columns:

library(tidyverse)  
df1 %>% 
   rownames_to_column("id") %>% 
   full_join(rownames_to_column(df2, "id"), by=c("id", "City")) %>% 
   mutate(Quality = coalesce(Quality.y, Quality.x), Quality.x = NULL, Quality.y = NULL)
          id    City Quality
    1     a1      NY     bad
    2     a2      LA pending
    3     a3      LA    good
    4     a4      SF pending
    5     a5      SF    good
    6     a6      NY     bad
    7     a7 Chicago pending
    8 a10000      DC pending
    9   a200 Chicago    good

change value in a column of a table based on a reference

Answers (2)

Related Questions