user14644617
How to match conditions from two different tables (R language)

Suppose I have data like this:

data_a <- data.frame(
    "Node_A" = c("John", "John", "John", "Peter", "Peter", "Peter", "Tim", "Kevin", "Adam", "Adam", "Xavier"),
    "Node_B" = c("Claude", "Peter", "Tim", "Tim", "Claude", "Henry", "Kevin", "Claude", "Tim", "Henry", "Claude")
)

food <- data.frame(
    "Person" = c("John", "Peter", "Tim", "Kevin", "Adam", "Xavier", "Claude", "Henry"),
    "Favorite_Food" = c("pizza", "pizza", "tacos", "pizza", "ice cream", "sushi", "sushi", "pizza")
)

I want to add a new column called "common" to the "data_a" data frame. For a given row, if the two people (Node_A and Node_B) have the same "Favorite_Food" in the "food" data frame, the value of "common" should be "1"; otherwise it should be "0".

I am not sure how to begin to solve this problem.

I tried to create the following logic:

data_a$common = ifelse(c(data_a$Node_A, food$Person, food$Favorite_Food) = c(data_a$Node_B, food$Person, food$Favorite_Food)), data_a$common = "1", "0")

But I am not sure this is correct.

Could someone please show me how to do this? Thanks

Upvotes: 1

Views: 85

Answers (2)

Ben

Reputation: 30474

If you did want to use match within an ifelse you could try the following.

By itself, match(data_a$Node_A, food$Person) returns, for each name in Node_A, the position of its first match in food$Person (Node_A holds the values to be matched; food$Person is the vector they are matched against):

[1] 1 1 1 2 2 2 3 4 5 5 6

For example, the 4th person in Node_A is Peter, and the 4th element of the resulting vector is 2, so the second row of food holds Peter's favorite food (pizza).

Using the match result as an index into food$Favorite_Food, for example food$Favorite_Food[2], returns the corresponding favorite food, here Peter's from the second row.
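
The two steps described above can be tried in isolation. This is a minimal sketch using the question's data; the names node_a, idx_a and food_a are illustrative only, and stringsAsFactors = FALSE is set so the columns are plain character vectors regardless of R version.

```r
node_a <- c("John", "John", "John", "Peter", "Peter", "Peter",
            "Tim", "Kevin", "Adam", "Adam", "Xavier")

food <- data.frame(
  Person = c("John", "Peter", "Tim", "Kevin", "Adam", "Xavier", "Claude", "Henry"),
  Favorite_Food = c("pizza", "pizza", "tacos", "pizza", "ice cream", "sushi", "sushi", "pizza"),
  stringsAsFactors = FALSE
)

# Step 1: position of each Node_A name within food$Person
idx_a <- match(node_a, food$Person)
idx_a
# [1] 1 1 1 2 2 2 3 4 5 5 6

# Step 2: use those positions to pull each person's favorite food
food_a <- food$Favorite_Food[idx_a]
food_a[4]  # Peter is the 4th entry of node_a
# [1] "pizza"
```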

The same can be done for Node_B, and the two results compared.

data_a$common <- ifelse(
  food$Favorite_Food[match(data_a$Node_A, food$Person)] == 
  food$Favorite_Food[match(data_a$Node_B, food$Person)], 1, 0)

Output

   Node_A Node_B common
1    John Claude      0
2    John  Peter      1
3    John    Tim      0
4   Peter    Tim      0
5   Peter Claude      0
6   Peter  Henry      1
7     Tim  Kevin      0
8   Kevin Claude      0
9    Adam    Tim      0
10   Adam  Henry      0
11 Xavier Claude      1
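
One case the question's data does not exercise: match returns NA for a name that is missing from food$Person, so the comparison yields NA rather than 0. The sketch below assumes you want such rows to count as 0, which is a choice the question does not state. The edges data frame and the name "Zoe" are made up purely to trigger the case.

```r
food <- data.frame(
  Person = c("John", "Peter", "Claude"),
  Favorite_Food = c("pizza", "pizza", "sushi"),
  stringsAsFactors = FALSE
)
edges <- data.frame(
  Node_A = c("John", "John", "Zoe"),      # "Zoe" is hypothetical and absent from food
  Node_B = c("Peter", "Claude", "Peter"),
  stringsAsFactors = FALSE
)

food_a <- food$Favorite_Food[match(edges$Node_A, food$Person)]
food_b <- food$Favorite_Food[match(edges$Node_B, food$Person)]

# The plain comparison propagates the NA from the unmatched name
ifelse(food_a == food_b, 1, 0)
# [1]  1  0 NA

# Assumption: an unknown favorite food counts as "not in common"
edges$common <- as.integer(!is.na(food_a) & !is.na(food_b) & food_a == food_b)
edges$common
# [1] 1 0 0
```

If NA is the more honest answer for unknown people in your data, keep the plain comparison instead.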

Upvotes: 1

Rui Barradas

Reputation: 76402

Here is a base R solution. It uses match twice to get the food for each person and then compares the two foods per row.

i <- match(data_a$Node_A, food$Person)
j <- match(data_a$Node_B, food$Person)
data_a$common <- as.integer(food$Favorite_Food[i] == food$Favorite_Food[j])

data_a
#   Node_A Node_B common
#1    John Claude      0
#2    John  Peter      1
#3    John    Tim      0
#4   Peter    Tim      0
#5   Peter Claude      0
#6   Peter  Henry      1
#7     Tim  Kevin      0
#8   Kevin Claude      0
#9    Adam    Tim      0
#10   Adam  Henry      0
#11 Xavier Claude      1

Finally, remove the temporary index vectors.

rm(i, j)
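
If you would rather not create temporary index vectors at all, a named lookup vector is one possible variation on the same idea. This is a sketch, not part of the solution above: the name fav is illustrative, and stringsAsFactors = FALSE is assumed so that indexing happens by name rather than by factor code.

```r
data_a <- data.frame(
  Node_A = c("John", "John", "John", "Peter", "Peter", "Peter",
             "Tim", "Kevin", "Adam", "Adam", "Xavier"),
  Node_B = c("Claude", "Peter", "Tim", "Tim", "Claude", "Henry",
             "Kevin", "Claude", "Tim", "Henry", "Claude"),
  stringsAsFactors = FALSE
)
food <- data.frame(
  Person = c("John", "Peter", "Tim", "Kevin", "Adam", "Xavier", "Claude", "Henry"),
  Favorite_Food = c("pizza", "pizza", "tacos", "pizza", "ice cream", "sushi", "sushi", "pizza"),
  stringsAsFactors = FALSE
)

# Named character vector: names are people, values are their favorite foods
fav <- setNames(food$Favorite_Food, food$Person)

# Index by name for both columns, then compare row by row
data_a$common <- as.integer(fav[data_a$Node_A] == fav[data_a$Node_B])
data_a$common
# [1] 0 1 0 0 0 1 0 0 0 0 1
```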

Upvotes: 0
