TonyHo

Reputation: 254

Which -mfpu value (neon/vfp/vfpv3) should I specify when evaluating and comparing floating-point performance on ARM processors?

I want to evaluate the floating-point performance of several different ARM processors. I am using lmbench and pi_css5, but I am confused about the floating-point tests.

From cat /proc/cpuinfo (below), I guess there are three floating-point features: neon, vfp, and vfpv3? From this question&answer, it seems to depend on the compiler. Still, I don't know which one to specify in the compile flags (-mfpu=neon/vfp/vfpv3), whether I should compile the program once with each of them, or whether I should simply not specify -mfpu at all.

cat /proc/cpuinfo               
Processor       : ARMv7 Processor rev 4 (v7l)
BogoMIPS    : 532.00
Features    : swp half thumb fastmult vfp edsp neon vfpv3 tls 
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x2
CPU part    : 0xc09
CPU revision    : 4

Upvotes: 5

Views: 3975

Answers (2)

TonyHo

Reputation: 254

I have tried each of them, and using -mfpu=neon together with -march=armv7-a and -mfloat-abi=softfp seems to be the proper configuration.
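To double-check that flags like these actually took effect, one option is to look at the macros the compiler predefines. The following is only a sketch: the function name describe_fp_target is made up for illustration, and the mapping from macros to ABI names assumes GCC's usual behaviour on 32-bit ARM (__SOFTFP__ for soft, __ARM_PCS_VFP for hard, neither for softfp). It reports what the compiler was told to target, not what the CPU supports.

```c
#include <stdio.h>

/* Hypothetical helper: writes a one-line summary of the floating-point
 * target the compiler was configured for into buf. Returns the value of
 * snprintf (the length the full summary needs). This reflects the
 * compile-time flags only, not the CPU the binary runs on. */
int describe_fp_target(char *buf, size_t size)
{
    const char *neon = "no";
    const char *hw_fp = "no";
    const char *abi = "unknown";

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    neon = "yes";      /* NEON code generation enabled, e.g. -mfpu=neon */
#endif

#if defined(__ARM_FP)
    hw_fp = "yes";     /* ACLE macro: hardware floating point is in use */
#endif

#if defined(__SOFTFP__)
    abi = "soft";      /* assumed to correspond to -mfloat-abi=soft */
#elif defined(__ARM_PCS_VFP)
    abi = "hard";      /* assumed to correspond to -mfloat-abi=hard */
#elif defined(__arm__)
    abi = "softfp";    /* FP instructions, arguments in integer registers */
#endif

    return snprintf(buf, size, "neon=%s hw_fp=%s float_abi=%s",
                    neon, hw_fp, abi);
}
```

Calling it with a small buffer and printing the result from main should be enough to compare builds made with different -mfpu values. On a non-ARM host it simply reports no NEON and an unknown ABI.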

Besides, one reference (ARM Cortex-A8 vs. Intel Atom) is very useful for ARM benchmarking. Another helpful article is about ARM Cortex-A processors and gcc command lines; it clarifies the SIMD coprocessor configuration.

Upvotes: 0

auselen

Reputation: 28087

It might be a little more complicated than you anticipated. The GCC ARM options page doesn't explain the FPU versions; however, ARM's manual for their own compiler does. Also note that Linux doesn't tell the whole story about FPU features: it only reports vfp, vfpv3, vfpv3d16, or vfpv4.

Back to your question: you should select the greatest common factor among them (the most capable FPU option that every processor under test supports), compile your code for it, and compare the results. On the other hand, if one CPU has vfpv4 and another has vfpv3, which one would you think is better?
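Finding that "greatest common factor" amounts to intersecting the Features lines from /proc/cpuinfo on each board. A rough sketch of that step follows; the function names are invented here, and it assumes you have already copied each Features line into a string with space-separated tokens.

```c
#include <string.h>

/* Returns 1 if the token word[0..len) appears as a whole
 * space-separated token in list, otherwise 0. */
static int has_feature(const char *list, const char *word, size_t len)
{
    const char *p = list;
    while (*p) {
        while (*p == ' ') p++;                 /* skip separators */
        const char *start = p;
        while (*p && *p != ' ') p++;           /* advance to token end */
        if (len > 0 && (size_t)(p - start) == len &&
            strncmp(start, word, len) == 0)
            return 1;
    }
    return 0;
}

/* Writes the features present in both a and b into out, separated by
 * single spaces, in the order they appear in a. Stops early if out is
 * too small. Returns the number of features written. */
int common_features(const char *a, const char *b, char *out, size_t size)
{
    int count = 0;
    size_t used = 0;
    if (size > 0) out[0] = '\0';

    const char *p = a;
    while (*p) {
        while (*p == ' ') p++;
        const char *start = p;
        while (*p && *p != ' ') p++;
        size_t len = (size_t)(p - start);
        if (len == 0 || !has_feature(b, start, len))
            continue;
        /* room for an optional separator, the token, and the '\0' */
        size_t need = len + (used ? 1 : 0);
        if (used + need + 1 > size)
            break;
        if (used) out[used++] = ' ';
        memcpy(out + used, start, len);
        used += len;
        out[used] = '\0';
        count++;
    }
    return count;
}
```

For example, intersecting the Features line from the question with a made-up second board that reports "swp half thumb fastmult vfp edsp vfpv3 vfpv3d16 tls" leaves vfp and vfpv3 but drops neon, which would suggest -mfpu=vfpv3 (or a d16 variant) as the common target for that pair.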

If your question is as simple as choosing between neon, vfp, or vfpv3, select neon (source).

-mfpu=neon selects VFPv3 with NEON coprocessor extensions.

From the gcc manual,

If the selected floating-point hardware includes the NEON extension (e.g. -mfpu=neon), note that floating-point operations will not be used by GCC's auto-vectorization pass unless `-funsafe-math-optimizations' is also specified. This is because NEON hardware does not fully implement the IEEE 754 standard for floating-point arithmetic (in particular denormal values are treated as zero), so the use of NEON instructions may lead to a loss of precision.

See for instance, Subnormal IEEE-754 floating point numbers support on ios... for more on this topic.
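If you want to see whether denormals survive on a given board and build, a small run-time probe can help. This is just a sketch with a made-up function name. It exercises a single scalar division, so it reflects the current scalar floating-point mode; on ARMv7, NEON vector instructions flush denormals to zero regardless, so a "preserved" result here does not mean vectorized code keeps them.

```c
#include <float.h>

/* Hypothetical probe: halves the smallest normal float at run time.
 * Returns 1 if the result is a nonzero value below FLT_MIN (a
 * subnormal was kept), or 0 if it was flushed to zero. The volatile
 * qualifiers stop the compiler from folding the division at compile
 * time. */
int subnormals_preserved(void)
{
    volatile float smallest_normal = FLT_MIN;
    volatile float half = smallest_normal / 2.0f;
    float result = half;
    return result > 0.0f && result < FLT_MIN;
}
```

Building this once per -mfpu setting (and with and without -funsafe-math-optimizations) and printing the return value gives a quick indication of which configurations flush scalar results.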

Upvotes: 8
