thomthom

Reputation: 2914

Switched compiler - performance dropped - Trying to understand why

I'm writing a Ruby C Extension that computes the "soft selection" of a set of vertices. It runs many iterations in which the distance between 3D points is computed.

Initially I used the Pelles C IDE, basing the project on a template I had found.

I then made an update in which I switched to nmake, which comes with Visual Studio C++ Express 2010. The result was a performance drop, which was odd because if anything it should have been faster.

I then reverted to the original code I had written in Pelles C, compiled it with nmake, and found that the exact same code was slower.

Pelles C

> Updating soft selection took 0.741 seconds (12176 of 21692 Vertices)
> Updating soft selection took 0.751 seconds (10911 of 21692 Vertices)
> Updating soft selection took 0.859 seconds (10765 of 21692 Vertices)
> Updating soft selection took 0.753 seconds (10653 of 21692 Vertices)
> Updating soft selection took 0.75 seconds (10747 of 21692 Vertices)
> Updating soft selection took 0.751 seconds (10822 of 21692 Vertices)

Visual Studio

> Updating soft selection took 1.282 seconds (11853 of 21692 Vertices)
> Updating soft selection took 1.273 seconds (12204 of 21692 Vertices)
> Updating soft selection took 1.286 seconds (11720 of 21692 Vertices)
> Updating soft selection took 1.248 seconds (12996 of 21692 Vertices)
> Updating soft selection took 1.293 seconds (10705 of 21692 Vertices)
> Updating soft selection took 1.276 seconds (12204 of 21692 Vertices)

I'm very inexperienced with C and compiling - but I assume that the performance difference is due to differences between the compiler and the compile instructions?

For the nmake version I used the Makefile produced by extconf.rb - for the Pelles C version I used whatever the setting was for the sample project I found.

Am I right that it's the CFLAGS that matter here?

CFLAGS?

For the Pelles C project it is: CCFLAGS = -Tx86-coff -MD -Ot -Ox -W1 -Gd -Ze -Zl#

For the nmake project it is: CFLAGS = -MD -Zi -O2b2xg- -G6

When I looked up CFLAGS and performance it usually mentioned the flags O, O2 and O3 - now I see an O2 in the nmake Makefile, but with an odd set of trailing characters.

The Pelles C project has Ot and Ox ... ?

I was unable to work out the meaning of these. The extension will be compiled under Windows and OSX (PPC and Intel). How should I configure the compiler to get the most performance out of it, or at least restore the performance I had?

Makefile and Pelles C configuration

Here is a Pastie of the nmake Makefile: http://pastie.org/3543595

Here is a Pastie of the Pelles C project file: http://pastie.org/3543597

Upvotes: 2

Views: 533

Answers (1)

thomthom

Reputation: 2914

Ok, been looking up information. What I learned was that the CFLAGS options depend on the compiler.

I found the options for MS's cl compiler: http://msdn.microsoft.com/en-us/library/fwkeyyhe%28v=vs.80%29.aspx

I compared them to the options documented in Pelles C's help file.

Recompiled with these CFLAGS: $CFLAGS = '-MD -Ot -Ox -W1'
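As a rough sketch of how this could be wired into extconf.rb: because the flags depend on the compiler, one option is to pick them from the compiler command that Ruby reports in RbConfig::CONFIG['CC']. The helper name cflags_for, the extension name 'soft_selection', and the '-O2' fallback for non-cl compilers are assumptions for illustration, not something taken from the original Makefile.

```ruby
require 'rbconfig'

# Hypothetical helper: choose optimisation flags from the compiler command.
# cc is a string such as "cl -nologo" or "gcc".
def cflags_for(cc)
  # Look only at the executable name, ignoring any arguments after it.
  name = File.basename(cc.to_s.split.first.to_s).downcase
  if name == 'cl' || name == 'cl.exe'
    '-MD -Ot -Ox -W1'   # the flags that restored performance with MS cl
  else
    '-O2'               # assumption: a generic default for gcc-style compilers
  end
end

# In extconf.rb this might be used as follows (not run here):
#   require 'mkmf'
#   $CFLAGS = cflags_for(RbConfig::CONFIG['CC'])
#   create_makefile('soft_selection')   # hypothetical extension name
```

Keeping the choice in one small function makes it easier to add the OSX flags later without touching the generated Makefile by hand.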

Performance results after recompiling:

> Updating soft selection took 0.679 seconds (12032 of 21692 Vertices)
> Updating soft selection took 0.607 seconds (13470 of 21692 Vertices)
> Updating soft selection took 0.717 seconds (13587 of 21692 Vertices)
> Updating soft selection took 0.613 seconds (13218 of 21692 Vertices)
> Updating soft selection took 0.635 seconds (9964 of 21692 Vertices)
> Updating soft selection took 0.746 seconds (10765 of 21692 Vertices)

Voilà! Performance restored - even looks to be slightly faster. :D

Even got rid of a warning about unknown option -G6 and some other obsolete flag.
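To put a number on "slightly faster", the timings quoted in this thread can be averaged per build. This is only a rough comparison, since the number of selected vertices differs from run to run; the labels and the mean helper below are just for illustration.

```ruby
# Timings in seconds, copied from the runs quoted in this thread.
timings = {
  'Pelles C'             => [0.741, 0.751, 0.859, 0.753, 0.75, 0.751],
  'cl (Makefile flags)'  => [1.282, 1.273, 1.286, 1.248, 1.293, 1.276],
  'cl (-MD -Ot -Ox -W1)' => [0.679, 0.607, 0.717, 0.613, 0.635, 0.746]
}

# Arithmetic mean of a list of floats.
def mean(values)
  values.sum / values.size
end

# Average time per build, printed as a small table.
averages = timings.transform_values { |runs| mean(runs) }
averages.each do |label, avg|
  puts format('%-22s %.3f s', label, avg)
end
```

With these six runs each, the averages come out at roughly 0.77 s for Pelles C, 1.28 s for cl with the generated Makefile flags, and 0.67 s for cl with the new flags.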

Upvotes: 4
