Thor Russell

Reputation: 129

Vectorization with floating point in Android

I am calculating a lot of instances of the distance from an n-dimensional point (n is 10-39) to each point in an array. I want it to go as fast as possible on Android 4.0+, specifically on the Galaxy S3. I have got the hardware FPU working, but have heard that you can speed things up further with vectorization and NEON. Questions like "Android build system, NEON and non-NEON builds", however, don't give me a simple answer.

What is the simplest way to use this vectorization on the S3, ideally with an example that shows a speed-up for this kind of calculation (distance from an n-dimensional point to each n-dimensional point in an array)?

Here is the loop code:

// go through each point in the vector
for (bi = 0; bi < sizeOfVect; bi++) {
    r[bi] = 0.0; // initialise distance

    // calculate distance in each dimension (d is 10-39 depending)
    // s1 is the n dim point, b is the vector array
    for (di = 0; di < d; di++) {
        rj[di] = s1[i*d + di] - b[bi*d + di];
        r[bi] += rj[di] * rj[di];
    }
}

Upvotes: 1

Views: 1018

Answers (2)

auselen

Reputation: 28087

You have two options for getting vectorization out of a CPU (focusing on ARM): either the compiler does it for you, or you do it yourself.

You can utilize vector instructions (NEON) in an ARM CPU by writing assembly or using intrinsics.
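As a rough sketch of the intrinsics route, something like the following could compute the squared distance four floats at a time. It assumes the data is 32-bit `float` (the question does not say), and the function names `squared_distance` and `distances_to_all` are made up for illustration. The `#if` guard lets the same file build without NEON, falling back to the plain scalar loop.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Squared Euclidean distance between two d-dimensional points.
 * Assumption: 32-bit float data; the question does not state the type. */
float squared_distance(const float *a, const float *b, size_t d)
{
    size_t i = 0;
    float sum = 0.0f;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    float32x4_t acc = vdupq_n_f32(0.0f);
    for (; i + 4 <= d; i += 4) {
        float32x4_t diff = vsubq_f32(vld1q_f32(a + i), vld1q_f32(b + i));
        acc = vmlaq_f32(acc, diff, diff);   /* acc += diff * diff in 4 lanes */
    }
    /* Horizontal add of the four lanes into one scalar. */
    float32x2_t half = vadd_f32(vget_low_f32(acc), vget_high_f32(acc));
    half = vpadd_f32(half, half);
    sum = vget_lane_f32(half, 0);
#endif

    /* Scalar tail for d not divisible by 4 (and the whole loop without NEON). */
    for (; i < d; i++) {
        float diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

/* Hypothetical wrapper mirroring the question's outer loop: squared distance
 * from one point to each of n points stored contiguously in b. */
void distances_to_all(const float *point, const float *b,
                      size_t n, size_t d, float *r)
{
    assert(point && b && r);
    for (size_t bi = 0; bi < n; bi++)
        r[bi] = squared_distance(point, b + bi * d, d);
}
```

With the question's variables this would be called roughly as `distances_to_all(s1 + i*d, b, sizeOfVect, d, r)`. Since d is only 10-39, the scalar tail handles a noticeable share of the work, so measure on the device before assuming a speed-up.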

You can also get help from the compiler, but then you have to write vectorizable code. For an example of how to do this, see this SO post.
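For the compiler route, a loop shaped like the sketch below tends to be easier for an auto-vectorizer to handle: `float` data (an assumption, the question does not give the type), `restrict` pointers so the compiler knows the arrays do not overlap, and a local accumulator instead of writing to `r[bi]` on every iteration. The name `squared_distances` is hypothetical. With GCC targeting ARMv7 you would typically build with something like `-O3 -mfpu=neon -funsafe-math-optimizations`; GCC's documentation notes that it will not auto-vectorize floating-point code for NEON without the last flag, because NEON arithmetic on ARMv7 is not fully IEEE 754 compliant. Whether the loop actually gets vectorized still depends on the compiler version, so check the generated assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Squared distance from one point to each of n points stored contiguously
 * in b (n * d floats). restrict (C99) promises the arrays do not alias. */
void squared_distances(const float *restrict point,
                       const float *restrict b,
                       float *restrict r,
                       size_t n, size_t d)
{
    assert(point && b && r);
    for (size_t bi = 0; bi < n; bi++) {
        const float *row = b + bi * d;
        float sum = 0.0f;              /* local accumulator, stored once */
        for (size_t di = 0; di < d; di++) {
            float diff = point[di] - row[di];
            sum += diff * diff;
        }
        r[bi] = sum;
    }
}
```

Compared with the loop in the question, this drops the intermediate `rj` array, which is not needed if only the distances are wanted.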

Upvotes: 1

Budius

Reputation: 39846

To get hardware acceleration for your calculations, there are two routes:

  • NDK: You'll use the Android Native Development Kit to write your code in C++ with calls specific to that hardware (NEON, which is the CPU's SIMD unit rather than the GPU) to speed up your algorithm.
  • Renderscript: You'll use the Renderscript computation API to process all the data and let the framework auto-parallelise it for you between CPU and GPU.

I've never worked with either of them, but if I had to pick one route for a specific application, I would give Renderscript a try, because it is a "one code for all devices" type of solution (for ICS and up).

Upvotes: 0
