Reputation: 6260
Given several sets of elements, e.g.:
int set1[5] {5601, 935, 4153, 2195, 422};
int set2[5] {5601, 935, 23, 44, 422};
int set3[5] {4205, 935, 4153, 2195, 15};
int set4[5] {4205, 589, 4015, 44, 422};
Where order matters (i.e. 1, 2, 3 is different to 2, 1, 3), what is an efficient algorithm to locate a specific set? E.g., you want to locate:
int value[5] {5601, 935, 23, 44, 422};
Considerations:
Insertion cost for new sets is not an issue, so they can be stored in any data structure, in order to optimize search time.
The sets will contain anywhere between 1 and 1,000,000 elements each (approximately, and there will be anywhere between 1 and 1000 sets (again, approximately). However the number of elements will always be the same for any given set of sets (e.g. if one set has 10 elements, then all sets will have 10 elements).
Follow-up question, I'll be implementing this in C++ so I'd be interested to find out for any recommended algorithms, whether they exist in an open source C++ library (preferably STL, Boost, or QT, but I'll consider others too).
Upvotes: 1
Views: 241
Reputation: 526
In this case I would use a Hashtable. You would have accesstime in somethin of O(1)
(well worst-case is O(n)
but with a good Hashfunction this isn't a problem)
So if your Hashtabel is big enough and you don't have to worry about space this would definitely be the fastest way of search. (Consider Binary Search is in O(log(n))
)
Hashtables are only available in the STL of the new C++0x standard. See STL::TR1
Upvotes: 0
Reputation:
If order matters, you're looking at sequences, not sets. Terminology matters.
Since you're only considering about 1,000 sequences, it should be easy to just store them in a hashtable, with good performance. I'd consider constructing a string to represent each sequence, say, by concatenating the string representation of each element, plus some sort of delimiter, and hashing that.
Upvotes: 5
Reputation: 355029
Use a std::vector<set_type>
to store the sets. Insert all of the sets into the container. Sort the container using std::sort
. Find elements using std::binary_search
(or std::lower_bound
if you need an iterator to the element).
The type you use for set_type
depends on the number of elements in each set. If the number of elements is known to be small, then std::array<T, N>
would be sufficient; otherwise, consider std::vector<T>
.
Upvotes: 4
Reputation: 3497
define an order for the sets and then insert them into a tree. Or define a hashcode and a comparator and hashtable them.
Upvotes: 0