Reputation: 1193
What data structure supports the following set operations efficiently both in time and space?
I can think of 3 different ways to do these operations, suppose that we have two sets, their sizes are both N:
Bit Array:
1. O(N) 2.O(N) 3.O(1) 4.O(1) 5.O(1)
HashTable:
1. O(N) 2.O(N) 3.O(1) 4.O(1) 5.O(1)
Ordered Tree:
1. O(NlogN) 2.O(NlogN) 3.O(logN) 4.O(logN) 5.O(logN)
Bit Array and HashTable are fast but they use too much memory, Ordered Tree is slow but consumes less memory.
Please note: the set may contain other types except integer, such as float number or string
What other data structures are both fast and general, and space efficient?
Upvotes: 1
Views: 2213
Reputation: 6853
I'd suggest Binary Heap (the easiest one), Binomial Heap (speeding up union) and Fibonacci Heap (hardest to implement, but having the best known amortized times for standard operations).
Operation
Binary Heap
Binomial Heap
Fibonacci heap (amortized)
1. union
O(n) O(logn) O(1)
2. difference O(nlogn) O(nlogn) O(nlogn)
3. find(ismemberof) O(n) O(n) O(n)
4. add O(lgn) O(lgn) O(1)
5. delete O(lgn) O(lgn) O(lgn)
However, these structures are mostly used when insert, find/extract min(max), union and delete operations are needed. find and diff operations still have poor running time.
Upvotes: 1
Reputation: 73470
One option is to augment your ordered tree with a bloom filter to speed up the ismemberof
type tests.
I think that the overall behaviour would be something like:
1. O(N log(N) ) 2. O( ? ) 3.O(1) 4.O(log(N)) 5.O( log(N) )
However the exact details will depend on the size of the filter, the size of your sets and the size of your domain.
Another option may be Judy Arrays. I've heard good things about them for this kind of use, but have no personal experience.
Yet another option is a forrest approach (rather than a pure binary tree).
Upvotes: 2