Reputation: 2825
I have n
sets identified by setId
and each one could contain an arbitrary number of elements that is a pair (elementId, priority)
.
My algorithm should take in input two setId
, and give in output a set containing the first m
elements which are in the intersection of the two input set and have the highest priority (sum of priorities).
Example:
n=3, m=1
Set1: { (1, 1), (12, 2) }
Set2: { (1, 4), (23, 6), (33, 22) }
Set3: { (33, 1), (1, 16 }
Input: Set2, Set3
Output: { (33, 23) }
My question is: assuming that I have infinite space what is the best data structures I can use in order to optimize performance?
Of course precomputing all possible intersection isn't a valid answer.
Edit:
Realistic numbers:
n
, sets number, is ~ 10^6
~ 5*10^3
.Upvotes: 3
Views: 1246
Reputation: 308206
Take one of the sets and convert it to a hash map. Iterate the other set, and for each member try to look up the element in the hash map. If you find it, add the result to a heap; if the size of the heap grows to one greater than the number of elements you wish to keep, throw away the lowest item in the heap.
Upvotes: 3