Midnitte

Reputation: 1

Create dummy variable if a dataframe contains rows from another dataframe

I'm trying to create a dummy variable in df2 that flags whether each of its rows also appears in df1. Note that df2 has more columns than df1, so only the columns they share should be compared.

e.g.:

df1:

A B C
1 2 3
4 5 6
7 8 0

df2:

A B C D
1 2 3 E
4 5 6 F
7 8 9 G

Desired result (df2):

A B C D Dummy
1 2 3 E 1
4 5 6 F 1
7 8 9 G 0

Any good approaches I should consider?

I've tried using an ifelse function applied to the dataframe, but I suspect I've coded it wrong. Any tips would be appreciated!

Upvotes: 0

Views: 418

Answers (1)

AdroMine

Reputation: 1527

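One approach would be to add a column called "dummy" to df1, then left-join it onto df2 using all of df1's original columns as keys. Rows of df2 with no match in df1 get NA in dummy, which can then be recoded to 0.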

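library(dplyr)

# Flag every row of df1; the join carries the flag over to matching rows of df2
df1$dummy <- 1
left_join(df2, df1) %>%
    # Rows of df2 without a match in df1 come back as NA, so recode them to 0
    mutate(dummy = ifelse(is.na(dummy), 0, dummy))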

# Joining, by = c("A", "B", "C")
# A B C D dummy
# 1 2 3 E     1
# 4 5 6 F     1
# 7 8 9 G     0

By default, left_join() joins on every variable that has the same name in both data frames; use the by argument to choose the join columns explicitly.
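
As a minimal sketch of that last point, the version below names the join columns explicitly and de-duplicates df1 first, since repeated rows in df1 would otherwise repeat the matching rows of df2 in the result. The data frames are rebuilt from the question, and key_cols, lookup and result are only illustrative names, so adjust them to your data.

```r
library(dplyr)

# Example data rebuilt from the question.
df1 <- data.frame(A = c(1, 4, 7), B = c(2, 5, 8), C = c(3, 6, 0))
df2 <- data.frame(A = c(1, 4, 7), B = c(2, 5, 8), C = c(3, 6, 9),
                  D = c("E", "F", "G"))

# Assumption: rows are matched on A, B and C only.
key_cols <- c("A", "B", "C")

# One row per key combination, so the join cannot multiply rows of df2.
lookup <- df1 %>%
    select(all_of(key_cols)) %>%
    distinct() %>%
    mutate(dummy = 1)

# Explicit `by` avoids accidental matches on other shared column names.
result <- left_join(df2, lookup, by = key_cols) %>%
    mutate(dummy = coalesce(dummy, 0))

print(result)
```

Here coalesce() does the same job as the ifelse(is.na(...)) call above, and building a separate lookup table leaves df1 itself unmodified.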

Upvotes: 1
