Reputation: 2212
I've been trying to convince a friend of mine to avoid using dynamically allocated arrays and start moving over to the STL vectors. I sent him some sample code to show a couple things that could be done with STL and functors/generators:
#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
#include <cstdlib> // rand, RAND_MAX
#define EVENTS 10000000
struct random_double {
double operator() () { return (double)rand()/RAND_MAX; }
};
int main(int argc, char **argv){
std::vector<double> vd (EVENTS);
generate(vd.begin(), vd.end(), random_double());
copy(vd.begin(), vd.end(), std::ostream_iterator<double>(std::cout, "\n"));
return 0;
}
His reply to this, although he feels it's more elegant, is that his own code is faster (by almost a factor of 2!) Here's the C code he replied with:
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>
#include <string.h>
#define EVENTS 10000000
__inline double random_double() {
return (double)rand()/RAND_MAX;
}
int main(int argc, char **argv){
unsigned int i;
double *vd;
vd = (double *) malloc(EVENTS*sizeof(double));
for(i=0;i<EVENTS;i++){ vd[i]=random_double(); }
for(i=0;i<EVENTS;i++){ printf("%lf\n",vd[i]); }
free(vd);
return 0;
}
So I ran the simple timing test to see just what happens, and here's what I got:
> time ./c++test > /dev/null
real 0m14.665s
user 0m14.577s
sys 0m0.092s
> time ./ctest > /dev/null
real 0m8.070s
user 0m8.001s
sys 0m0.072s
The compiler options, using g++ were: g++ -finline -funroll-loops. Nothing too special. Can anyone tell me why the C++/STL version is slower in this case? Where is the bottleneck, and will I ever be able to sell my friend on using STL containers?
Upvotes: 2
Views: 891
Reputation: 264729
I would argue that you are not even running the same code.
The C code has no error checking and leaks memory on an exception.
To be a fair comparison you need to make the C program do what the C++ program is doing.
int errorNumber = 0; // Need a way to pass error information back from the function
int main(int argc, char **argv)
{
......
{
vd[i]=random_double();
//
// In C++ this logic is implicit with the use of exceptions.
// Any example where you don't do error checking is not valid.
// In real life any code has to have this logic built in by the developer
//
if (errorNumber != 0)
{ break;
}
}
........
free(vd); // The cost of freeing the memory is not zero; that needs to be factored in.
return 0;
}
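For what it's worth, here is a minimal sketch of the "implicit" cleanup on the C++ side. The Tracked type and both function names are made up for illustration; they are not from the question. The only assumption is that an exception is thrown after the vector has been built.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical element type that counts how many instances are alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    Tracked(const Tracked&) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Builds a vector, then fails part-way through, as a real routine might.
void fill_then_fail(std::size_t n) {
    std::vector<Tracked> v(n);  // storage is owned by the vector
    throw std::runtime_error("simulated failure");
    // No explicit cleanup: v's destructor runs during stack unwinding.
}

// Returns the number of Tracked objects still alive after the failure.
int live_after_failure(std::size_t n) {
    try {
        fill_then_fail(n);
    } catch (const std::runtime_error&) {
        // Error handled here; the vector has already been destroyed.
    }
    return Tracked::live;
}
```

Calling live_after_failure(5) should leave no live elements behind, which is the bookkeeping the C version has to write by hand on every error path.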
Upvotes: 1
Reputation: 5813
One trick for getting an idea of both the difference in speed between two implementations - and the reasons for it - is to delve into the assembly. Assembly really isn't that scary, and shows you exactly what's going on. It's also really handy for seeing what the compiler optimizes out. Generally speaking, more assembly instructions = longer, but keep in mind that some instructions take much longer than others.
In Visual Studio (and, I suspect, many other IDEs) there's an option to look over assembly interleaved with the corresponding C++ lines. (In VC, this is Debug->Windows->Disassembly).
Upvotes: 1
Reputation: 41519
Believing that the insertion iterator on std::cout performs badly, I tried the following functor:
struct Print {
void operator()( double d ) { printf("%lf\n", d); }
};
And use for_each on the STL container.
generate(vd.begin(), vd.end(), random_double());
//copy(vd.begin(), vd.end(), std::ostream_iterator<double>(std::cout, "\n"));
std::for_each(vd.begin(), vd.end(), Print() );
As a matter of fact, I now got
time.exe raw_vs_stl.exe stl > t.txt
real 0m 2.48s
user 0m 1.68s
sys 0m 0.28s
for the STL version... while the 'raw' version results in more or less the same.
time.exe raw_vs_stl.exe raw > t.txt
real 0m 9.22s
user 0m 7.89s
sys 0m 0.67s
Conclusion: vector performance is as good as a raw array's. It's safer and easier to use.
(disclaimer: used VC2005)
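Part of the reason a vector can keep up with a raw array is that its elements sit in one contiguous buffer, so C-style code can work on it directly. A rough sketch, with sum_raw and sum_vector as made-up names for illustration:

```cpp
#include <cstddef>
#include <vector>

// C-style routine that only knows about raw pointers.
double sum_raw(const double* p, std::size_t n) {
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i) total += p[i];
    return total;
}

// Hands the vector's own buffer to the C-style routine; nothing is copied.
double sum_vector(const std::vector<double>& vd) {
    return vd.empty() ? 0.0 : sum_raw(&vd[0], vd.size());
}
```

So your friend's existing pointer-based functions can stay as they are; only the allocation and cleanup move into the vector.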
Upvotes: 2
Reputation: 14392
There might be occasions where the STL is slower, but hand-rolling maps/sets for multiple insert/remove/lookup would be hard to accomplish.
As Neil pointed out, printf beats iostream on speed (also a point in Scott Meyers' "More Effective C++", Item 23). However, in more complicated systems, it pays off to be able to write complete classes straight into log output. The printf way would be to sprintf the class information in a function and pass the result to the logger as a string parameter. That way, the gain would be smaller.
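To make the logging point concrete, here is a small sketch of both styles. The Event struct and the two log_with_* helpers are hypothetical, invented only for this comparison.

```cpp
#include <cstdio>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical class we want to appear in log output.
struct Event {
    int id;
    int hits;
};

// iostream way: teach the stream how to print an Event once.
std::ostream& operator<<(std::ostream& os, const Event& e) {
    return os << "Event(id=" << e.id << ", hits=" << e.hits << ")";
}

// Any stream-based logger can now take the object directly.
std::string log_with_stream(const Event& e) {
    std::ostringstream line;
    line << "processed " << e;
    return line.str();
}

// printf way: format the object into a buffer first, then pass the string on.
std::string log_with_printf(const Event& e) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "processed Event(id=%d, hits=%d)", e.id, e.hits);
    return std::string(buf);
}
```

Both helpers build the same line; the difference is that the printf version has to repeat the format of Event wherever it is logged, or hide it behind an extra formatting function.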
Upvotes: 0
Reputation: 1619
In high performance situations (such as games), it would be wise to avoid STL containers. Yes they provide excellent functionality, but they also provide a bit of overhead. And that can be disastrous.
Personally, I only use std's file handling and the occasional vector.
But what do I know? ;)
EDIT: Have your friend take a look at Thrust, which attempts to provide STL-like functionality for calculating on the GPU.
Upvotes: -1
Reputation: 44832
Using printf:
for (std::vector<double>::iterator i = vd.begin(); i != vd.end(); ++i)
printf("%lf\n", *i);
results are:
koper@elisha ~/b $ time ./cpp > /dev/null
real 0m4.985s
user 0m4.930s
sys 0m0.050s
koper@elisha ~/b $ time ./c > /dev/null
real 0m4.973s
user 0m4.920s
sys 0m0.050s
Flags used: -O2 -funroll-loops -finline
Upvotes: 17
Reputation: 564891
Using STL, especially when using vectors and other nice utility classes, is probably always going to be slower than hand-rolled C code using malloc and inlined functions. There is no real way around it.
That being said, performance is not everything - not nearly so. Using STL provides many other benefits, including:
You're really trying to argue about working at a higher level of abstraction - there are tradeoffs here, typically in terms of performance, but there is a reason nearly all development has gone to higher abstraction levels; the gains are far more valuable than the sacrifices in most cases.
Upvotes: 4
Reputation:
Almost certainly the use of the iostream library versus printf(). If you want to time the algorithm, you should do your output outside the loop.
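A rough sketch of what that separation could look like, assuming a C++11 compiler for std::chrono. The names timed_generate and print_all are mine, not from the question.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <vector>

struct random_double {
    double operator()() const { return static_cast<double>(std::rand()) / RAND_MAX; }
};

// Fills a vector with n random doubles and reports, via elapsed_ms,
// how long the generation step alone took.
std::vector<double> timed_generate(std::size_t n, double& elapsed_ms) {
    typedef std::chrono::steady_clock clock;
    std::vector<double> vd(n);
    clock::time_point start = clock::now();
    std::generate(vd.begin(), vd.end(), random_double());
    clock::time_point stop = clock::now();
    elapsed_ms = std::chrono::duration<double, std::milli>(stop - start).count();
    return vd;
}

// Output happens afterwards, so its cost can be timed (or ignored) separately.
void print_all(const std::vector<double>& vd) {
    for (std::size_t i = 0; i < vd.size(); ++i) std::printf("%f\n", vd[i]);
}
```

Write elapsed_ms to stderr and you can still redirect stdout to /dev/null as in the question, while seeing how much of the total is generation and how much is I/O.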
Upvotes: 19