Reputation: 438
Is pushing integers into a vector, and then sorting the entire vector faster or slower than inserting the integers into a set, which sorts as you enter them. Sorry, I'm new to c++, and I'm not sure how to use the clock function. Could anyone help? Any help would be greatly appreciated. Thanks in advance.
#include <vector>
#include <iostream>
#include <set>
#include <algorithm>
using namespace std;
vector<int> possibility1;
set<int> possibility2;
int main()
{
random_device rd;
mt19937 rng(rd());
uniform_int_distribution<int> uni(0,1000);
// possibility 1
for(int i = 0; i < 1000000; i++) {
int r = uni(rng);
possibility1.push_back(r);
}
// possibility 2
for(int j = 0; j < 1000000; j++) {
int r = uni(rng);
possibility2.insert(r);
}
}
EDIT
For this case, the difference in time doesn't really matter, but if I have large classes with a lot of private variables, and a make a vector/set of objects (with a compare function, too), which one would be faster?
Upvotes: 3
Views: 1985
Reputation: 238351
Is putting integers into a vector, and then sorting faster, or putting integers into a set
Both have same asymptotic complexity O(n log n).
Set is a more complicated data structure, and it is most certainly going to have higher significant order coefficients and thus will be slower than a sorted vector by some factor in practice.
Whether the complexity difference has any significant impact to the performance of the program, depends on whether the size of the data structure is significant enough compared to other sources of complexity. This can be measured with a profiler.
However, if for example you were to need to keep your data structure sorted between insertions, rather than only after all insertions, then set would be asymptotically simpler O(n log n) than sorted vector O(n2).
Asymptotic complexity only guarantees that beyond some input threshold, set will be more efficient. But due to smaller coefficients, vector is probably still faster for inputs that are below the threshold. Many factors affect the threshold, such as hardware. To find out the threshold on a particular system, you can measure with a profiler.
Upvotes: 4
Reputation: 161
You can pretty simply determine average run times for a function call (like sort) using the C++ high resolution clock and getting the difference between two time_point
's retrieved with now
. This answer contains a good explanation of how to use that bit of the chrono library: C++ Cross-Platform High-Resolution Timer
You can use this kind of measurement to determine how long sorting takes in each case, but you should also bear in mind that set
sorts as you go whereas a vector
is sorted when sort
is called on it. If your application can benefit from amortizing the cost across multiple insertions -- for example, you insert elements into the set and access elements in the set in interleaved operations, vs. inserting all elements at once, sorting, then only accessing the sorted collection later -- set
might be the right choice. Several commenters have already mentioned that the right choice depends on your application, and it's why the advice is to decide on your own benchmarks and then measure them.
Upvotes: 1