R - Add table colume to reflect the presence of a value in another table

Question

I am moving a task from Excel to R for reproducibility and the ability to handle datafiles with more records than the limits of Excel. The conversion is going well, but I hit a roadblock due to my limited R skills (and that of my on-site colleagues!)

The problem

Subject dataframe

ID    Var1   Var2  
1     0.4    0.3
2     0.4    0.1
3     0.2    0.2
4     0.3    0.7

TypeList dataframe

ID  Type
1    A
1    B
1    C
2    C
3    B
3    A
4    C

What I would like to do is add a TypeA, TypeB, TypeC column from the TypeList dataframe to the Subject dataframe like this:

ID    Var1   Var2     TypeA   TypeB   TypeC
    1     0.4    0.3   TRUE   TRUE    TRUE
    2     0.4    0.1   FALSE  FALSE   TRUE
    3     0.2    0.2   TRUE   TRUE    FALSE
    4     0.3    0.7   FALSE  FALSE   TRUE

This is a simplified example, the actual Subject dataframe has some 2000 entries, the TypeList has about 70k entries for some 4000 types. Of those types, the interest is in 10 of them at this time.

Any suggestions would be greatly appreciated.

arg0naut91 · Accepted Answer

One way of going about it:

library(tidyverse)

df2 <- df2 %>% mutate(spread_var = TRUE, Type = paste0("Type", Type)) %>% 
               spread(Type, spread_var) %>% replace(., is.na(.), FALSE)

df1 <- left_join(df1, df2)

df1

  ID Var1 Var2 TypeA TypeB TypeC
1  1  0.4  0.3  TRUE  TRUE  TRUE
2  2  0.4  0.1 FALSE FALSE  TRUE
3  3  0.2  0.2  TRUE  TRUE FALSE
4  4  0.3  0.7 FALSE FALSE  TRUE

R - Add table colume to reflect the presence of a value in another table

Answers (2)

Related Questions