Reputation: 21620
Updated: The actual resolution was that the compile box that served my compile request was different. In the slower instance I was running code that had been compiled on a SuSE 9 box but was running on a SuSE 10 box. That was a sufficient difference for me to drop it and compare apples to apples. When using the same compile box, the results were as follows:
g++ was about two percent slower
delta real 4 minutes, delta user 4 minutes, delta system 5 seconds
Thanks!
gcc v4.3 vs g++ v4.3, reduced to the simplest case and built with nothing but simple flags:
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char **argv)
{
    int i=0;
    int j=0;
    int k=0;
    int m=0;
    int n=0;
    for (i=0;i<1000;i++)
        for (j=0;j<6000;j++)
            for (k=0;k<12000;k++)
            {
                m = i+j+k;
                n=(m+1+1);
            }
    return 0;
}
Is this a known issue? The 15% difference is very reproducible and holds across the board for real, system, and user time. I have to wait until tomorrow to post the assembly.
Update: I have only tried on one of my compile boxes. I am using SuSE 10.
Upvotes: 3
Views: 680
Reputation:
When compiled with gcc and g++ the only difference I see is within the first 4 lines.
gcc:
.file "loops.c"
.def ___main; .scl 2; .type 32; .endef
.text
.globl _main
g++:
.file "loops.c"
.def ___main; .scl 2; .type 32; .endef
.text
.align 2
.globl _main
As you can see, the only difference is that g++ emits an alignment directive (.align 2), which places the code on a word boundary. This tiny difference seems to be what causes the significant performance difference.
Here is a page explaining structure alignment. Although it is written for ARM/NetWinder, it is still applicable, as it discusses how alignment works on modern CPUs. You will want to read section 7 specifically, "What are the disadvantages of word alignment?":
http://netwinder.osuosl.org/users/b/brianbr/public_html/alignment.html
and here is a reference on the .align operation:
http://www.nersc.gov/vendor_docs/ibm/asm/align.htm
Benchmarks as requested:
gcc:
john@awesome:~$ time ./loopsC
real 0m21.212s
user 0m20.957s
sys 0m0.004s
g++:
john@awesome:~$ time ./loopsGPP
real 0m22.111s
user 0m21.817s
sys 0m0.000s
I reduced the innermost iteration count to 1200. The results don't differ as widely as I had hoped, but then again the assembly output was generated on Windows and the timings were done on Linux. Maybe MinGW handles alignment differently behind the scenes than gcc on Linux does.
Upvotes: 7
Reputation: 7505
One reason could be that gcc optimized the assignments of m and n so that they can run in parallel.
That can be done like this:
m = i+j+k;
n = i+j+k+2;
I am not sure that this can improve the performance by 15%. It might give a bit of a performance boost on a multicore CPU. The best way to find out is to compare the assembly code produced by the two compilers.
Upvotes: 1
Reputation: 112356
Oh, that is a fun one. But the code you gave us doesn't compile. You need
(int argc, char** argv)
Upvotes: 0
Reputation: 28690
In order to figure out why it's slower, you'll probably need to take a look at the assembly produced by each compiler. The g++ compiler must be doing something different from the gcc compiler.
Upvotes: 2