Reputation: 2487
I'm working on an application where runtime speed is more important than precision. The number crunching involves floating-point arithmetic, and I'm concerned about `double` and/or `long double` being handled in software instead of natively on the processor (this is always true on a 32-bit arch, right?). I would like to conditionally compile using the highest precision that has hardware support, but I haven't found a quick and easy way to detect software emulation. I'm using g++ on GNU/Linux and I'm not concerned about portability. It's running on an x86 arch, so I'm assuming that `float` is always native.
Upvotes: 3
Views: 692
Reputation: 146988
x86 does `float`, `double`, and more in hardware, and has done so for a long time. Many modern 32-bit programs assume SSE2 support, as that's been around for several years now and can be depended on to be present on a consumer chip.
Upvotes: 3
Reputation: 248189
"(this is always true on a 32-bit arch right?)"

No. Common CPUs have dedicated hardware for `double` (and in some cases `long double` as well). And honestly, if performance is a concern, you should know your CPU: hit the CPU manuals and figure out what the performance penalty for each datatype is.

Even on CPUs that lack "proper" `double` support, it still isn't emulated in software. The Cell CPU (of PlayStation 3 fame) simply passes a `double` through the FPU twice, so it's a lot costlier than a `float` computation, but it's not software emulation. You still have dedicated instructions for `double` processing; they're just less efficient than the equivalent `float` instructions.

Unless you target either 20-year-old CPUs or small, limited embedded processors, floating-point instructions will be handled in hardware, although not all architectures handle every datatype equally efficiently.
Upvotes: 3
Reputation: 272677
The floating-point unit (FPU) on modern x86 is natively double precision (in fact, it's even wider than `double`), not single precision (the "32" in 32-bit describes the integer register width, not the floating-point width). This is not true, however, if your code takes advantage of vectorized SSE instructions, which do either 4 single-precision or 2 double-precision operations in parallel.

If not, then the main speed hit from switching your app from `float` to `double` will be the increased memory bandwidth.
Upvotes: 3
Reputation: 96291
On x86, the x87 hardware typically computes with 80 bits internally, which is more than enough for `double`.

Are you sure that performance is a real concern (from profiling the code), or are you just guessing that it may not be supported?
Upvotes: 1