Reputation: 623
Please note: The title of this question might be ambiguous, so I request other users to edit it. I could not come up with a suitable title for this problem.
The problem discussed here is part of an algorithm called RSAA (Relative Support Apriori Algorithm); here is the research paper link: http://dl.acm.org/citation.cfm?id=937663
Problem: I am implementing algorithms like Apriori in Python, and while doing so I am facing an issue where I have to generate patterns (candidate itemsets) like these at each step of the algorithm.
Here's the example:
Input:
input = [[5, 3], [5, 4], [5, 6], [7, 6]]
Output should be:
output = [[5,3,4], [5,3,6], [4,5,6], [5,6,7]]
Each sublist of the output list (^) must have exactly 3 items (example: [5,3,4]).
The approach to solve this problem should be generic, because in the next step:
Input:
input = [[5,3,4], [5,3,6], [4,5,6], [5,6,7]]
Output:
output = [[5,3,4,6], [4,5,6,7]]
Each sublist of the output list (^) must have exactly 4 items.
([5,3,4,6] is formed by joining [5,3,4] and [5,3,6]. We can't join [5,3,4] and [5,6,7] because doing so would create [5,3,4,6,7], which has length 5.)
Upvotes: 0
Views: 2693
Reputation: 823
I think your requirement is already covered by Apriori.
I wrote a blog post about the algorithm, but unfortunately it is in Chinese.
Here is the link: http://www.zealseeker.com/archives/apriori-algorithm-python/
Here are the snippets (also hosted on a Chinese site).
has_infrequent_subset and apriori_gen may be the two functions you want.
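For readers who cannot follow the linked post, here is a minimal sketch of what a function named has_infrequent_subset conventionally does in textbook Apriori: it is the prune check that rejects a (k+1)-item candidate if any of its k-item subsets is not among the frequent k-itemsets. The body below is an assumption written for illustration, not the code from the blog, and frequent_2 is made-up data.

```python
from itertools import combinations

def has_infrequent_subset(candidate, frequent_k):
    """Return True if some k-item subset of the (k+1)-item candidate
    is missing from the frequent k-itemsets (assumed prune rule)."""
    frequent = {frozenset(s) for s in frequent_k}
    k = len(candidate) - 1
    return any(frozenset(sub) not in frequent
               for sub in combinations(candidate, k))

# Hypothetical frequent 2-itemsets, for illustration only.
frequent_2 = [[1, 2], [1, 3], [2, 3], [2, 4]]
print(has_infrequent_subset([1, 2, 3], frequent_2))  # False
print(has_infrequent_subset([1, 2, 4], frequent_2))  # True: [1, 4] is missing
```

Applied to the question's first input, this check would reject [5, 3, 4] because [3, 4] is not in the input, so the question's first expected output matches the join step alone, without this prune.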
If the code is useful for you, comment on my answer and I'll be glad to keep helping you.
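It is easy to get the intersection and union of two sets in Python.
def join(a, b):
    c = a & b  # get the intersection
    # a and b (same size) can be joined only if they share all but one item
    if len(c) == len(a) - 1:
        return a | b  # their union
    return None

a = set([5, 6])
b = set([6, 7])
print(join(a, b))  # {5, 6, 7}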
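To make the idea generic, the same set test can be applied to every pair in the input list. The sketch below assumes all input itemsets have the same length k; generate_candidates is a name chosen here for illustration and is not taken from the question, the blog, or the RSAA paper.

```python
from itertools import combinations

def generate_candidates(itemsets):
    """Join every pair of k-itemsets that share k-1 items into a (k+1)-itemset."""
    sets = [frozenset(s) for s in itemsets]
    seen = set()
    result = []
    for a, b in combinations(sets, 2):
        union = a | b
        # Two distinct k-itemsets share k-1 items exactly when
        # their union has k+1 items.
        if len(union) == len(a) + 1 and union not in seen:
            seen.add(union)  # skip duplicates produced by different pairs
            result.append(sorted(union))
    return result

print(generate_candidates([[5, 3], [5, 4], [5, 6], [7, 6]]))
# [[3, 4, 5], [3, 5, 6], [4, 5, 6], [5, 6, 7]]
print(generate_candidates([[5, 3, 4], [5, 3, 6], [4, 5, 6], [5, 6, 7]]))
# [[3, 4, 5, 6], [3, 5, 6, 7], [4, 5, 6, 7]]
```

For the first input this reproduces the expected output from the question (items sorted within each sublist). For the second input the plain join also yields [3, 5, 6, 7], from [5,3,6] and [5,6,7], which the question's expected output does not list; if that candidate should be dropped, an extra filtering rule is needed on top of this join.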
Upvotes: 1