And

Reputation: 318

Accelerator restriction: unsupported operation: RSQRTSS

I have a simple n-body implementation that I am trying to compile for NVIDIA GPUs (Tesla K20m/GeForce GTX 650 Ti). I use the following compiler options:

-Minfo=all -acc -Minline -Mfpapprox -ta=tesla:cc35/nvidia

Everything works without -Mfpapprox, but with it the compilation fails with the following output:

346, Accelerator restriction: unsupported operation: RSQRTSS

Line 346 reads:

float rdistance=1.0f/sqrtf(drSquared); 

where

float drSquared=dx*dx+dy*dy+dz*dz+softening;

and dx, dy, and dz are float values. This line is inside a #pragma acc parallel loop independent for() construct. What is the problem with -Mfpapprox?

Upvotes: 0

Views: 74

Answers (1)

Mat Colgrove

Reputation: 5646

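-Mfpapprox tells the compiler to use very low-precision CPU instructions to approximate DIV or SQRT. RSQRTSS, the operation named in the error, is one of them: the x86 SSE approximate reciprocal square root. These instructions are not supported on the GPU. The GPU SQRT is both fast and precise, so there is no need for a low-precision version.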

Actually, even on the CPU, I'd recommend you not use -Mfpapprox unless you really understand the mathematics of your code and it can handle a high degree of imprecision (as much as 5-6 bits or ~20 ULPs off). We added this flag about 10 years ago since at the time the CPU's divide operation was very expensive. However, CPU performance for divide has greatly improved since then (as has sqrt), so you're generally better off not sacrificing precision for the little bit of speed-up you might get from this flag.

I'll put in an issue report requesting that the compiler ignore -Mfpapprox for GPU code so you won't see this error.

Upvotes: 1
