mac
mac

Reputation: 165

R: merging data tables based on a custom rule

I am trying to merge two data tables with a rule I am not able to traslate into R code.

Let us assume that we are dealing with customers: each customer can fall within one or more categories, and each category can buy a certain subset of products.

Then I have two data frames to merge, i.e.

df1
customer   category
Anthony    X
Anthony    Y
Beatrix    Y
Charles    Z

df2
product    category
item1      X
item2      Y
item3      Y
item3      Z

df3 = required merge of (df1, df2)
customer   product
Anthony    item1
Anthony    item2
Anthony    item3
Beatrix    item2
Beatrix    item3
Charles    item3

Thanks for help!

Upvotes: 0

Views: 113

Answers (1)

Z.Lin
Z.Lin

Reputation: 29125

Based on your example, I understand it as joining all products associated with each category to each customer's categories. The following will work in this case:

Generate data:

df1 <- read.table(header = T, text = "customer   category
Anthony    X
Anthony    Y
Beatrix    Y
Charles    Z")

df2 <- read.table(header = T, text = "product    category
item1      X
item2      Y
item3      Y
item3      Z")

dplyr package solution:

library(dplyr)
left_join(df1, df2) %>% select(-category)

  customer product
1  Anthony   item1
2  Anthony   item2
3  Anthony   item3
4  Beatrix   item2
5  Beatrix   item3
6  Charles   item3

Edit alternative solution from base package (suggested by lmo):

merge(df1, df2, by="category")[-1]

  customer product
1  Anthony   item1
2  Anthony   item2
3  Anthony   item3
4  Beatrix   item2
5  Beatrix   item3
6  Charles   item3

Upvotes: 1

Related Questions