Reputation: 443
I have a vector of strings and have to check if each element in vector is present in a given list of 5000 words. Besides the mundane method of two nested loops, is there any faster way to do this in C++?
Upvotes: 9
Views: 4652
Reputation: 76523
So you have a vector of strings, with each string having one or more words, and you have a vector that's a dictionary, and you're supposed to determine which words in the vector of strings are also in the dictionary? The vector of strings is an annoyance, since you need to look at each word. I'd start by creating a new vector, splitting each string into words, and pushing each word into the new vector. Then sort the new vector and run it through the std::unique
algorithm to eliminate duplicates. Then sort the dictionary. Then run both ranges through std::set_intersection
to write the result.
Upvotes: 2
Reputation: 1017
You could sort the vector, then you can solve this with one "loop" (taken that your dictionary is sorted too) which means O(n) not counting in the cost of the sort.
Upvotes: 2
Reputation: 69773
You should put the list of strings into an std::set. It's a data structure optimized for searching. Finding if a given element is in the set or not is an operation which is much faster than iterating all entries.
When you are already using C++11, you can also use the std::unordered_set which is even faster for lookup, because it's implemented as a hash table.
Should this be for school/university: Be prepared to explain how these data structures manage to be faster. When your instructor asks you to explain why you used them, "some guys on the internet told me" is unlikely to earn you a sticker in the class book.
Upvotes: 11
Reputation: 7683
You could put the list of words in an std::unordered_set. Then, for each element in the vector, you just have to test if it is in the unordered_set in O(1). You would have an expected complexity of O(n) (look at the comment to see why it is only expected).
Upvotes: 3