WolfLink

Reputation: 3317

Fastest Inverse Square Root on iPhone

I'm working on an iPhone app that runs certain physics calculations thousands of times per second, and I'm optimizing the code to improve the framerate. One piece I'd like to speed up is the inverse square root. Right now I'm using the Quake 3 fast inverse square root method. After some research, however, I read that the NEON instruction set offers a faster way. I'm unfamiliar with inline assembly and can't figure out how to use NEON. I tried using the math-neon library, but I get compiler errors because most of its NEON-based functions lack a return statement.
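For reference, the Quake 3 method mentioned above is usually written along these lines. This is a sketch of the widely circulated version, not the exact code from the app: the name `quake_rsqrt` is made up for illustration, and `memcpy` stands in for the original pointer cast to avoid undefined behavior.

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for x > 0 (hypothetical helper name). */
static float quake_rsqrt(float x) {
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);       /* reinterpret the float as an integer */
    bits = 0x5f3759dfu - (bits >> 1);     /* magic-constant initial guess */
    float y;
    memcpy(&y, &bits, sizeof y);          /* back to float */
    y = y * (1.5f - 0.5f * x * y * y);    /* one Newton-Raphson refinement step */
    return y;
}
```

With a single refinement step the result is only approximate, so `quake_rsqrt(4.0f)` lands close to 0.5 rather than exactly on it.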

EDIT: I've suddenly been getting some "unclear question" close votes. Although I think it's quite clear, and those who answered obviously understood, maybe some people need it stated explicitly: How do you use NEON to perform faster calculations? And is it really the fastest method for getting the inverse square root on the iPhone?

EDIT: I did some more formal testing of NEON vs. Quake today, but if anything, I'm even more uncertain about the outcome now:

While Quake vs. NEON was too close to call in the app performance test, Quake vs. 1/sqrtf() was quite clear-cut in the first test, and the second test produced extremely consistent values. What matters in the end, though, is app performance, so I'm going to base my final decision on that test.

Upvotes: 6

Views: 1301

Answers (2)

DarkDust

Reputation: 92384

The accepted answer to the question you linked already covers this, but doesn't spell it out:

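#include <arm_neon.h>

// vrsqrte_f32 takes a two-lane vector, not a scalar float.
float inverseSqrt(float someFloat) {
    float32x2_t v = vdup_n_f32(someFloat);      // copy the value into both lanes
    float32x2_t estimate = vrsqrte_f32(v);      // low-precision estimate of 1/sqrt
    return vget_lane_f32(estimate, 0);          // read lane 0 back as a float
}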

Header and function are already provided by the iOS SDK.

Upvotes: 5

Fjölnir

Reputation: 490

There's a NEON implementation of invsqrt at https://code.google.com/p/math-neon/source/browse/trunk/math_sqrtf.c; you should be able to copy the assembly bit as-is.

Upvotes: 2
