Reputation: 406
I have a bit of an issue, I was recently told that for an un-ordered value for input, a bunch of random values, lets say 1 Million of them, that using a set would be more efficient than using a vector, and then sorting said vector with the basic sort algorithm function, but when I used them, and checked them through the time function, in the terminal, and valgrind, it showed that both time complexity, and space usage were faster for the vector, even with the addition of the sort function being called. The person who gave me the advice to use the set is a lot more experienced than me in the C++ language, but I always have to test things out myself prior to taking peoples advice. The test codes follow.
For Set
std::set<int> testSet;
for(int i(0); i<= 1000000; ++i)
testSet.insert(-i);
For Vector
std::vector<int> testVector;
for(int i(0); i<= 1000000; ++i)
testVector.push_back(i * -1);
std::sort(testVector.begin(), testVector.end());
I know that these are not random variables, it wouldn't be fair since set does not allow duplicates, and vector does sothey would be different sizes for this basic function point. Can anyone clarify why the set should be used, sans the point of the no duplicates one.
I did not do any tests with the unordered set either. Not too sure of the differences between the two given points.
Upvotes: 2
Views: 2418
Reputation: 385108
This is too vague and ignores/misses out several crucial factors. If your friend said precisely this, then your friend (regardless of his or her experience) was wrong. More likely you are somewhat misinterpreting their words and reading into them a simplified version of matters.
When you want a sorted final product, the sorting is "amortized" when you insert into a set, because you get little bits of sorting action each time. If you will be inserting periodically and many times, then that spreading-out of the workload may be what you want. The total, when added up, may still be more than for a vector (consider the occasional rebalancing and so forth; your vector just needs to be moved to a larger block of memory once in a while), but you've spread it out so as not to noticeably slow down some individual other part of your program.
But if you're just dumping all the elements into a vector and sorting straight away, not only is there less work for the container & algorithm to do but you probably don't mind it taking a noticeable amount of time.
You haven't really stated your use case in any detail so I won't pretend to give specifics here, but the only possible answer to your question as posed is both "it depends" and "the question is fundamentally somewhat meaningless"; you cannot just take two data structures and sorting methodologies, and ask "which is more efficient?" without a use case. You have, however, correctly measured the time and space requirements and if you've done that against your real-world use case then, well, you have your answer don't you?
Upvotes: 6