displayname
displayname

Reputation: 1

Changing value in one Data Frame using condition in another data Frame

I am trying to change the value in one dataframe based on the conditions pulling information from another dataframe.

Lets say I have 2 dataframes linked by the common column ID:

df1 
      ID    a    b    c
1    111   NA   NA   NA
2    222   NA   NA   NA
3    333   NA   NA   NA
4    444   NA   NA   NA
5    555   NA   NA   NA
6    666   NA   NA   NA

df2
       ID        a    b    c
1     111     Open   No  Yes
2     222   Closed   No   No
3     333     Open  Yes  Yes
4     444     Open   No   No
5     555   Closed   No  Yes
6     666     Open  Yes  Yes

All of the columns in df1 are NA with the exception of the ID column. df1 and df2 are not the same size, and df2 has significantly more rows and IDs than df1. df2 also has more columns than df1.

Using the list of IDs in df1, I want to find the specific row in df2 and pull data from corresponding columns into df1. Ill use column 'a' as an example.

For each ID in df1, I want to identify whether that ID had the value OPEN or CLOSED in df2[a]. If the value of df2[a] = OPEN then df1[a] should display OPEN, otherwise it should display CLOSED.

I have tried solving this problem multiple ways using several packages but do not have a ton of experience. Any advice on how to do this is greatly appreciated!

Upvotes: 0

Views: 26

Answers (1)

wibeasley
wibeasley

Reputation: 5297

This is essentially just a left join with dplyr::left_join().

df1 <- 
  tibble::tibble(
    ID = c(111L, 222L, 333L), 
    a = NA
  )

df2 <- 
  tibble::tribble(
    ~ID   , ~a        , ~b      , ~c,
    111L  , "Open"    , "No"    , "Yes",
    222L  , "Closed"  , "No"    , "No",
    333L  , "Open"    , "Yes"   , "Yes",
    444L  , "Open"    , "No"    , "No",
    555L  , "Closed"  , "No"    , "Yes",
    666L  , "Open"    , "Yes"   , "Yes"
  )

df1 |> 
  dplyr::select(                   # Retain only the column(s) you need
    ID
  ) |> 
  dplyr::left_join(df2, by = "ID") # Join on the ID column

Result:

# A tibble: 3 x 4
     ID a      b     c    
  <int> <chr>  <chr> <chr>
1   111 Open   No    Yes  
2   222 Closed No    No   
3   333 Open   Yes   Yes 

Upvotes: 0

Related Questions