chopin_is_the_best
chopin_is_the_best

Reputation: 2101

Vlookup() equivalent in Dplyr

My df looks like:

library(tidyverse) 
df_1 <- tibble::tribble(
          ~sub_date, ~value_1, ~value_2,
          "2020-05",       58,      130,
          "2020-05",       58,       "check",
          "2020-03",       50,      120,
          "2020-03",       55,       "check",
          "2020-03",       55,       "check"
          )

I want to change the values of the rows containing "check" with the reference values here:

df_ref <- tibble::tribble(
            ~sub_date, ~ref_value,
            "2020-05",        123,
            "2020-03",        234
            )

Basically - using the df_ref table as a reference only for the rows containing check.

I want to use the equivalent of a lookup() in excel, used in a if() function.

Final result being:

df_final <- tibble::tribble(
              ~sub_date, ~value_1, ~value_2,
              "2020-05",       58,      130,
              "2020-05",       58,      123,
              "2020-03",       50,      120,
              "2020-03",       55,      234,
              "2020-03",       55,      234
              )

Upvotes: 0

Views: 1866

Answers (4)

caldwellst
caldwellst

Reputation: 5956

You can also just do it a single line with ifelse and match since we aren't really interested in joining the dataframes. Wrap it in as.numeric if you want numeric output.

library(dplyr)

mutate(df_1, 
       value_2 = ifelse(value_2 == "check", 
                        df_ref$ref_value[match(sub_date, df_ref$sub_date)],
                        value_2))

Upvotes: 1

s_baldur
s_baldur

Reputation: 33498

Using base R:

ref_lut <- with(df_ref, setNames(ref_value, sub_date))

df_1$value_2 <- 
  ifelse(df_1$value_2 == "check", ref_lut[df_1$sub_date], df_1$value_2)

df_1

  sub_date value_1 value_2
  <chr>      <dbl> <chr>  
1 2020-05       58 130    
2 2020-05       58 123    
3 2020-03       50 120    
4 2020-03       55 234    
5 2020-03       55 234    

Upvotes: 0

Bruno
Bruno

Reputation: 4150

Good luck when there are multiple matches

library(tidyverse)
df_1 <- tibble::tribble(
  ~sub_date, ~value_1, ~value_2,
  "2020-05",       58,      "130",
  "2020-05",       58,       "check",
  "2020-03",       50,      "120",
  "2020-03",       55,       "check",
  "2020-03",       55,       "check"
)

df_ref <- tibble::tribble(
  ~sub_date, ~ref_value,
  "2020-05",        123,
  "2020-03",        234
)

df_1 %>% 
  left_join(df_ref) %>%
  mutate(value_2_true = ifelse(value_2 == "check",ref_value,value_2)) %>%
  mutate(value_2 = value_2_true %>% as.numeric()) %>% 
  select(-value_2_true,-ref_value)

Upvotes: 0

Ric S
Ric S

Reputation: 9257

This code should work

df_1 %>% 
  mutate(value_2 = as.numeric(na_if(value_2, "check"))) %>% 
  left_join(df_ref, by = "sub_date") %>% 
  mutate(value_2 = coalesce(value_2, ref_value)) %>% 
  select(-ref_value)

Small explanation: first we set to NA all the "check" values thanks to na_if, then we join with the lookup table, then we coalesce the two columns value_2 and ref_value, i.e. take the first non-missing value between the two.


Output

# A tibble: 5 x 3
#   sub_date value_1 value_2
#   <chr>      <dbl>   <dbl>
# 1 2020-05       58     130
# 2 2020-05       58     123
# 3 2020-03       50     120
# 4 2020-03       55     234
# 5 2020-03       55     234

Small note: your df_1 doesn't work as you pasted it in your question. Below I adjusted its values as character so that it works

df_1 <- tibble::tribble(
  ~sub_date, ~value_1, ~value_2,
  "2020-05",       58,      "130",
  "2020-05",       58,      "check",
  "2020-03",       50,      "120",
  "2020-03",       55,      "check",
  "2020-03",       55,      "check"
)

Upvotes: 1

Related Questions