Felipe Alvarenga

Reputation: 2652

Counting appearances of combinations of elements in R

I have a big client who purchases from me very frequently. I would like to know which combinations of products he often purchases together. For example, every time he buys product A he also buys product W, and the same happens for other combinations of products.

My goal is to identify these combinations so that I can offer product W to my other clients who purchase only product A (maybe they are buying product W from my competition without knowing that I sell it).

My data looks like this:

   codclient  codproduct            quant         date
1      101249     A                4.1600     2016-10-01
2      101249     W                1.3880     2016-10-01
3      101249     B                1.5268     2016-10-01
4      101249     A                0.8328     2016-11-01
5      101249     W                2.9148     2016-11-01
6      101249     B                2.7760     2016-11-01
7      101249     C                1.8750     2016-11-01
8      101250     A                0.6940     2016-10-01
9      101250     A                7.0000     2016-11-01
10     101251     B               12.0000     2016-11-01
11     101251     C             1000.0000     2016-11-01
12     101252     W             1000.0000     2016-11-01

Using intersect or Reduce(intersect, products_by_month), I can only see which items are purchased in every month.
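For reference, here is a minimal sketch of that attempt. It assumes products_by_month is a list with one character vector of products per purchase date, built here with split() from the rows for client 101249 in the table above; how that list is actually constructed is an assumption.

```r
# Rows for client 101249, copied from the table above.
df <- data.frame(
  codclient  = 101249,
  codproduct = c("A", "W", "B", "A", "W", "B", "C"),
  date       = c(rep("2016-10-01", 3), rep("2016-11-01", 4)),
  stringsAsFactors = FALSE
)

# Assumed structure: one character vector of products per month.
products_by_month <- split(df$codproduct, df$date)

# Keeps only the products that appear in every month.
always_bought <- Reduce(intersect, products_by_month)
print(always_bought)
```

This returns only the items common to all months, so a product bought together with A in most months but not all of them is dropped.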

Right now, my plan is to count how many times each combination of products appears in client 101249's purchases across months, and then use the most frequent baskets as a reference for suggestions to my other clients.

I can create the vector of product combinations using combn (every combination of two or three products would be enough), but I am still missing how to count how often they appear together in the vector of purchased products for each month.

Any thoughts on how to do it?

Upvotes: 0

Views: 554

Answers (2)

kpress

Reputation: 146

I've been wanting to delve into market basket analysis for a while now, and I know there is a dedicated package for this in R, arules:

https://cran.r-project.org/web/packages/arules/index.html

This may or may not help you but I figured I'd throw it out there just in case.
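As a rough, untuned sketch of how arules might be applied here: the code below treats each client/date pair as one basket and mines frequent itemsets of two or three products. The basket definition and the support threshold of 0.5 are assumptions for illustration, not recommendations, and the data is just client 101249's rows from the question.

```r
library(arules)

# Rows for client 101249, copied from the question.
df <- data.frame(
  codclient  = 101249,
  codproduct = c("A", "W", "B", "A", "W", "B", "C"),
  date       = c(rep("2016-10-01", 3), rep("2016-11-01", 4)),
  stringsAsFactors = FALSE
)

# Assumption: one basket per client and purchase date.
baskets <- split(df$codproduct, paste(df$codclient, df$date))
trans <- as(baskets, "transactions")

# Frequent itemsets containing two or three products.
# support = 0.5 is an arbitrary threshold chosen for this tiny sample.
itemsets <- apriori(
  trans,
  parameter = list(target = "frequent itemsets",
                   support = 0.5, minlen = 2, maxlen = 3)
)

# Show the itemsets, most frequent first.
inspect(sort(itemsets, by = "support"))
```

With real data the support threshold would need to be adjusted to the number of baskets available.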

Upvotes: 2

PhilC

Reputation: 787

You can do this with dplyr and tidyr (filter() comes from dplyr, spread() from tidyr):

spread(filter(df, codclient == 101249), codproduct, quant)

     codclient      date      A       B        C         W
1       101249 10/1/2016 4.1600  1.5268       NA    1.3880
2       101249 11/1/2016 0.8328  2.7760    1.875    2.9148
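The wide table shows which products were bought on each date but does not count combinations yet. One possible way to get the counts in base R, sketched under the assumption that a basket is the set of distinct products bought on one date, is to list every pair inside each basket with combn and tabulate them. The helper name pairs_in is made up for this example.

```r
# Rows for client 101249, copied from the question.
df <- data.frame(
  codclient  = 101249,
  codproduct = c("A", "W", "B", "A", "W", "B", "C"),
  date       = c(rep("2016-10-01", 3), rep("2016-11-01", 4)),
  stringsAsFactors = FALSE
)

# One basket (sorted, distinct products) per purchase date.
baskets <- lapply(split(df$codproduct, df$date),
                  function(x) sort(unique(x)))

# Hypothetical helper: all product pairs in one basket, as "A+B" labels.
pairs_in <- function(items) {
  if (length(items) < 2) return(character(0))
  combn(items, 2, FUN = paste, collapse = "+")
}

# Number of baskets in which each pair appears, most frequent first.
pair_counts <- sort(table(unlist(lapply(baskets, pairs_in))),
                    decreasing = TRUE)
print(pair_counts)
```

Sorting the products inside each basket keeps "A+W" and "W+A" from being counted as different pairs. Changing the 2 in combn to 3 (and the length check to match) would count triples instead.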

Upvotes: 0
