Reputation: 2914
I'm writing a Ruby C Extension that computes the "soft selection" of a set of vertices. It runs many iterations in which the distances between 3D points are computed.
Initially I used Pelles C IDE - based it on a template I had found.
I then made an update where I switched to using nmake, which comes with Visual Studio C++ Express 2010. What I found was a performance drop - which was odd, because if anything it should have been faster.
I then reverted to the original code I had written in Pelles C, compiled it with nmake, and found that the exact same code was slower.
Pelles C
> Updating soft selection took 0.741 seconds (12176 of 21692 Vertices)
> Updating soft selection took 0.751 seconds (10911 of 21692 Vertices)
> Updating soft selection took 0.859 seconds (10765 of 21692 Vertices)
> Updating soft selection took 0.753 seconds (10653 of 21692 Vertices)
> Updating soft selection took 0.75 seconds (10747 of 21692 Vertices)
> Updating soft selection took 0.751 seconds (10822 of 21692 Vertices)
Visual Studio
> Updating soft selection took 1.282 seconds (11853 of 21692 Vertices)
> Updating soft selection took 1.273 seconds (12204 of 21692 Vertices)
> Updating soft selection took 1.286 seconds (11720 of 21692 Vertices)
> Updating soft selection took 1.248 seconds (12996 of 21692 Vertices)
> Updating soft selection took 1.293 seconds (10705 of 21692 Vertices)
> Updating soft selection took 1.276 seconds (12204 of 21692 Vertices)
I'm very inexperienced with C and compiling - but I assume that the performance difference is due to differences between the compiler and the compile instructions?
For the nmake version I used the Makefile produced by extconf.rb - for the Pelles C version I used whatever settings came with the sample project I found.
Am I right that it's the CFLAGS that matter here?
CFLAGS?
For the Pelles C project it is:
CCFLAGS = -Tx86-coff -MD -Ot -Ox -W1 -Gd -Ze -Zl#
For the nmake project it is:
CFLAGS = -MD -Zi -O2b2xg- -G6
When I looked up CFLAGS and performance, the results usually mentioned the flags O, O2 and O3 - now I see an O2 in the nmake Makefile, but with an odd set of trailing characters. The Pelles C project has Ot and Ox ... ?
I was unable to work out the meaning of these. The extension will be compiled under Windows and OSX (PPC and Intel). How should I configure the compiler to get the most performance out of it, or at least restore the performance I had?
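One possible reading of the odd-looking O2 string: as far as I can tell, Microsoft's cl lets several /O options be combined into a single switch (the MSDN page gives /Odi as equivalent to /Od /Oi), so -O2b2xg- would expand to -O2 -Ob2 -Ox -Og-. The trailing g- would then turn global optimizations off, which is a plausible but unconfirmed cause of the slowdown. A rough Ruby sketch of that expansion follows; expand_o_flag is a hypothetical helper written only for illustration, not part of mkmf or cl.

```ruby
# Hypothetical helper: split a combined cl -O switch into separate switches.
# Assumption: the suffix only contains 1, 2, b<n>, d, g, i, s, t, x or y,
# each optionally followed by "-" to turn that option off.
def expand_o_flag(flag)
  suffix = flag.sub(/\A[-\/]O/, '')   # drop the leading "-O" or "/O"
  suffix.scan(/b\d|[12dgistxy]-?/).map { |part| "-O#{part}" }
end

puts expand_o_flag('-O2b2xg-').join(' ')
```

If this reading is right, replacing the generated switch with plain -Ot -Ox leaves global optimizations enabled.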
Makefile and Pelles C configuration
Here is a Pastie of the nmake Makefile: http://pastie.org/3543595
Here is a Pastie of the Pelles C project file: http://pastie.org/3543597
Upvotes: 2
Views: 533
Reputation: 2914
Ok, I've been looking up information. What I learned is that the CFLAGS options depend on the compiler.
I found the options for MS's cl compiler: http://msdn.microsoft.com/en-us/library/fwkeyyhe%28v=vs.80%29.aspx
I compared them to the options documented in Pelles C's help file.
I recompiled with these CFLAGS:
$CFLAGS = '-MD -Ot -Ox -W1'
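Since the extension also has to build on OSX, where cl's switches won't apply, the flags probably need to be chosen per compiler. Here is a minimal sketch of how that could look. cflags_for is a hypothetical helper, the -O2 fallback for gcc-style compilers is my assumption rather than something I measured, and 'soft_selection' is a placeholder extension name.

```ruby
require 'rbconfig'

# Hypothetical helper: choose optimization flags from the compiler command.
def cflags_for(cc)
  # RbConfig may report e.g. "cl -nologo"; keep only the program name.
  name = File.basename(cc.to_s.split.first.to_s).downcase.sub(/\.exe\z/, '')
  case name
  when 'cl'             then '-MD -Ot -Ox -W1' # the flags that worked for me with cl
  when /gcc|clang/, 'cc' then '-O2'            # assumption: not benchmarked
  else ''                                      # unknown compiler: add nothing
  end
end

puts cflags_for(RbConfig::CONFIG['CC'])

# Intended use inside extconf.rb (not executed here):
#   require 'mkmf'
#   $CFLAGS = cflags_for(RbConfig::CONFIG['CC'])
#   create_makefile('soft_selection')  # placeholder name
```

Assigning $CFLAGS outright discards whatever flags mkmf had generated, which is what the one-line version above does as well.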
Performance results after recompiling:
> Updating soft selection took 0.679 seconds (12032 of 21692 Vertices)
> Updating soft selection took 0.607 seconds (13470 of 21692 Vertices)
> Updating soft selection took 0.717 seconds (13587 of 21692 Vertices)
> Updating soft selection took 0.613 seconds (13218 of 21692 Vertices)
> Updating soft selection took 0.635 seconds (9964 of 21692 Vertices)
> Updating soft selection took 0.746 seconds (10765 of 21692 Vertices)
Voilà! Performance restored - even looks to be slightly faster. :D
I even got rid of a warning about the unknown option -G6 and some other obsolete flag.
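To put a number on "slightly faster", here is a small sketch that averages the timings posted in this thread. The labels and the mean helper are mine. Each run processed a different number of vertices, so treat this as a rough comparison only.

```ruby
# Timings in seconds, copied from the runs listed in this thread.
TIMINGS = {
  'Pelles C'               => [0.741, 0.751, 0.859, 0.753, 0.75, 0.751],
  'nmake, generated flags' => [1.282, 1.273, 1.286, 1.248, 1.293, 1.276],
  'nmake, -MD -Ot -Ox -W1' => [0.679, 0.607, 0.717, 0.613, 0.635, 0.746]
}

# Arithmetic mean of a list of floats.
def mean(values)
  values.sum / values.size
end

TIMINGS.each do |label, seconds|
  puts format('%-24s %.3f s', label, mean(seconds))
end
```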
Upvotes: 4