Supun Chamara

Reputation: 3

Frequent item set based on Apriori Algorithm and item based recommendation

I am using the Apriori algorithm and got the following frequent item sets with a minimum support of 2 (format: item set : support). My objective is to recommend items to a customer based on the identified frequent item sets and what he has already added to his shopping cart.

-+- L -+-
[5] : 3
[1] : 3
[2] : 3
[3] : 4

-+- L -+-
[1, 2] : 2
[1, 5] : 2
[3, 5] : 3
[1, 3] : 3
[2, 5] : 3
[2, 3] : 3

-+- L -+-
[1, 2, 5] : 2
[1, 2, 3] : 2
[2, 3, 5] : 3
[1, 3, 5] : 2

-+- L -+-
[1, 2, 3, 5] : 2

My first question: I used only the support rule to identify the sets above. At which point should I use the confidence and lift rules? When identifying the frequent item sets, or when making recommendations based on them?

My second question: if I use the confidence rule when making recommendations, how should I check it? E.g. if the user has added items 2 and 5 to his shopping list, I would recommend item 3 as well, based on the [2, 3, 5] set. What should the rule for recommending item 3 be? I.e. should the frequency of [2, 5] be close to the frequency of [2, 3, 5], or should the frequency of [3] be close to the frequency of [2, 3, 5]?

Which condition must I check before suggesting item 3?

My third question: in which situations is the lift rule important? Given the item sets above, it seems any item could be suggested even after considering the support, confidence and lift rules. Please correct me if I am wrong.

Thanks

Upvotes: 0

Views: 893

Answers (2)

n01dea

Reputation: 1580

To one and two:

Association rules look like:

{3} -> {2, 5}

This means, for example, that a customer who buys 3 also buys 2 and 5 with a certain probability. How strong the rule is depends on its support and confidence. For instance:

> dataset 
1: {1, 2, 3} 
2: {1, 2, 4}
3: {1, 2, 5} 

Support level = 0.6
Confidence level = 0.6
Number of cases = 3

// Get frequency of each item
Total number of 1's bought = 3
Total number of 2's bought = 3
Total number of 3's bought = 1
Total number of 4's bought = 1
Total number of 5's bought = 1

// Check support of each item against support level
Support of 1 = 3 / 3 = 1 >= 0.6 = support level
Support of 2 = 3 / 3 = 1 >= 0.6 = support level
Support of 3 = 1 / 3 = 0.33 < 0.6 = support level
Support of 4 = 1 / 3 = 0.33 < 0.6 = support level
Support of 5 = 1 / 3 = 0.33 < 0.6 = support level
-> Frequent itemsets = {(1), (2), (1, 2)}
-> Association rules = {1 -> 2}

// Check confidence of each association rule against confidence level
Confidence of 1 -> 2 = 3 / 3 = 1 >= 0.6 = confidence level
-> Strong association rules = {1 -> 2}
-> For a customer who buys 1, the recommendation is item 2
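
The same walk-through can be sketched in Python. This is a minimal illustration rather than a full Apriori implementation: it only goes up to pairs, and the helper names (`support`, `confidence`) are made up for this example. Note that it checks both directions of each pair, so it also reports 2 -> 1, which the listing above leaves out.

```python
from itertools import combinations

# The three baskets from the example above.
transactions = [{1, 2, 3}, {1, 2, 4}, {1, 2, 5}]
min_support = 0.6
min_confidence = 0.6

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Step 1: frequent single items, then frequent pairs built from them.
items = sorted(set().union(*transactions))
frequent_items = [i for i in items if support({i}, transactions) >= min_support]
frequent_pairs = [
    pair for pair in combinations(frequent_items, 2)
    if support(pair, transactions) >= min_support
]

# Step 2: turn each frequent pair into rules and keep the confident ones.
strong_rules = [
    (a, b)
    for x, y in frequent_pairs
    for a, b in ((x, y), (y, x))
    if confidence({a}, {b}, transactions) >= min_confidence
]

print(frequent_items)  # items that pass the support level
print(strong_rules)    # (antecedent, consequent) pairs that pass the confidence level
```

Support is used in step 1 to find the frequent itemsets; confidence only comes into play in step 2, when rules are derived from them.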

To three:

The provided data are frequent itemsets, not association rules. From the raw frequent itemsets alone it is not possible to derive suggestions such as "if a customer buys 1, he will also buy 2 with a certain probability". They first need to be processed into association rules. The lift value of an association rule indicates how its confidence relates to the expected value, i.e. the confidence you would see if both sides of the rule were independent. In other words, it measures how meaningful the rule is.
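
As a rough sketch of what lift measures, here is lift computed as confidence(A -> B) / support(B), once on the three baskets from the example above and once on a second basket list that is purely hypothetical and only there for contrast. The function names are made up for this illustration.

```python
# The three baskets from the example above.
transactions = [{1, 2, 3}, {1, 2, 4}, {1, 2, 5}]

# Hypothetical baskets (an assumption, not from the question), for contrast.
hypothetical_baskets = [{1, 2}, {1, 2}, {3}, {3}]

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def lift(antecedent, consequent, transactions):
    """confidence(antecedent -> consequent) divided by support(consequent)."""
    both = set(antecedent) | set(consequent)
    confidence = support(both, transactions) / support(antecedent, transactions)
    return confidence / support(consequent, transactions)

lift_1_to_2 = lift({1}, {2}, transactions)
lift_hypothetical = lift({1}, {2}, hypothetical_baskets)

print(lift_1_to_2)        # lift of 1 -> 2 on the example baskets
print(lift_hypothetical)  # lift of 1 -> 2 on the hypothetical baskets
```

On the example baskets the rule 1 -> 2 has confidence 1, yet its lift is only 1, because item 2 is in every basket anyway: knowing that the customer bought 1 adds nothing. On the hypothetical baskets the lift is above 1, because 2 shows up more often with 1 than it does overall. That is the situation where lift matters: it filters out rules that look confident only because the consequent is popular.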

Hope this helps.

Upvotes: 1

Has QUIT--Anony-Mousse

Reputation: 77454

You have done only the first step, frequent itemsets.

Frequent itemsets look like this: {1, 2, 3, 4, 5}, and each one has a support.

Now you need to do the second step, association rules.

You want rules that look like this:

1, 2, 3 -> 4, 5  (confidence: 60%, lift: 1.2)

This says: "if a user has 1, 2 and 3 in their basket, then recommend 4 and 5". You then compute confidence and lift to decide which rules to keep and use.
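
For the item sets in the question, that second step could look roughly like the sketch below. It treats the current basket as the left-hand side of the rule and every frequent superset as a candidate, with confidence = count(superset) / count(basket). The `recommend` function and the 0.8 threshold are assumptions made for this illustration, not part of Apriori itself.

```python
# Frequent item sets and their support counts, copied from the question.
counts = {
    frozenset(items): count
    for items, count in [
        ((5,), 3), ((1,), 3), ((2,), 3), ((3,), 4),
        ((1, 2), 2), ((1, 5), 2), ((3, 5), 3),
        ((1, 3), 3), ((2, 5), 3), ((2, 3), 3),
        ((1, 2, 5), 2), ((1, 2, 3), 2), ((2, 3, 5), 3), ((1, 3, 5), 2),
        ((1, 2, 3, 5), 2),
    ]
}

def recommend(basket, counts, min_confidence):
    """Return (items to suggest, confidence) for rules basket -> rest."""
    basket = frozenset(basket)
    if basket not in counts:
        return []  # the basket itself is not a frequent item set
    rules = []
    for itemset, count in counts.items():
        if basket < itemset:  # strict superset of the basket
            conf = count / counts[basket]
            if conf >= min_confidence:
                rules.append((sorted(itemset - basket), conf))
    # Highest confidence first.
    return sorted(rules, key=lambda rule: -rule[1])

# min_confidence = 0.8 is a hypothetical threshold chosen for this example.
suggestions = recommend({2, 5}, counts, min_confidence=0.8)
print(suggestions)
```

So for a basket of 2 and 5, the count of [2, 3, 5] is compared against the count of [2, 5], not against the count of [3]. The count of [3] only becomes relevant for lift, which additionally needs the total number of transactions.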

Upvotes: 0
