Aimen Sattar
Aimen Sattar

Reputation: 23

Lookup values from 2 columns in one df in another df based on row and column names

I have one dataframe which includes the results of a household survey for agricultural production. The data includes unit codes for each, which can be volume, weight or area based. I have created a lookup table that has a conversion for each crop type and unit code. I would like to lookup the conversion factor from the lookup table and input it into the survey dataframe. q14402_701 = crop code (153 codes, with many NA values for incomplete entries) q14404_2_701 = crop unit code(1:5, with many NA values where no conversion is found or incomplete values

df=farmhh_12_sub:

  hhid      indid       q14402_701 q14404_2_701
  <chr>     <chr>            <dbl>        <dbl>
1 120100181 12010018102         1101         2
2 120100191 12010018101         1102         3
3 120101761 12010176103         NA           NA
4 120101771 12010176102         1101         NA
5 120101781 12010176101         1103         2
6 120102891 12010289105         NA           1

lookup=crop_code

code desc_en  1    2 3     4 NA
1 1101   Wheat NA 0.15 1 0.001 NA
2 1102   Maize NA 0.14 1 0.001 NA
3 1103 Sorghum NA 0.14 1 0.001 NA
4 1104     Rye NA   NA 1 0.001 NA
5 1105     Oat NA   NA 1 0.001 NA
6 1106  Millet NA   NA 1 0.001 NA

I have tried a number of methodologies, but the most promising is based on this link.

Farmhh_12_sub$crop_conversion<-left_join(
  Farmhh_12_sub%>% 
    mutate_if(is.double, as.character),
  Crop_code%>%
    mutate_if(is.factor, as.character)%>%
  pivot_longer(!c(code, desc_en), names_to = "q14404_2_701", values_to = "crop_conversion")%>%
  Crop_code[complete.cases(Crop_code), ],
  by=c("code"="q14402_701","q14404_2_701"="q14404_2_701"))

desired output

  hhid      indid       q14402_701 q14404_2_701  crop_conversion
  <chr>     <chr>            <dbl>        <dbl>       <dbl>
1 120100181 12010018102         1101         2          0.15
2 120100191 12010018101         1102         3          0.001
3 120101761 12010176103         NA           NA          NA
4 120101771 12010176102         1101         NA          NA 
5 120101781 12010176101         1103         2           0.14
6 120102891 12010289105         NA           1           NA

Upvotes: 0

Views: 47

Answers (1)

WitheShadow
WitheShadow

Reputation: 542

If I understood the question correct, I would transform the lookup table to a long table and then join the tables with left_join()

library(dplyr)

crop_code_long <- crop_code %>% 
  select(-desc_en) %>% # this column is not needed
  pivot_longer(-code, names_to = "crop_unit_code", values_to = "crop_conversion") %>% 
  mutate(crop_unit_code = as.numeric(crop_unit_code))

left_join(farmhh_12_sub, crop_code_long, by = c("q14402_701" = "code", "q14404_2_701" = "crop_unit_code"))

Hope it works and helps.

Upvotes: 1

Related Questions