Skilldrick
Skilldrick

Reputation: 70819

Efficiency/speed for trigonometric functions

In a game I'm making, I've got two points, pt1 and pt2, and I want to work out the angle between them. I've already worked out the distance, in an earlier calculation. The obvious way would be to arctan the horizontal distance over the vertical distance (tan(theta) = opp/adj).

I'm wondering though, as I've already calculated the distance, would it be quicker to use arcsine/arccosine with the distance and dx or dy?

Also, might I be better off pre-calculating in a table?

Upvotes: 7

Views: 6916

Answers (9)

Mike Dunlavey
Mike Dunlavey

Reputation: 40659

Tons of good answers here.

By the way, if you use Math.atan2, you get a full 2π of angles out of it.

I would just do it, then run it flat out. If you don't like the speed, and if samples show that you're actually in that code most of the time and not someplace else, try replacing it with table lookup. If you don't need precision closer than 1 degree, you could use a pretty small table and interpolation.

Also, you may want to memoize the function. Why recompute something you already did recently?

Added: If you use a table, it only has to cover angles from 0-45 degrees (and it can be hard-coded). You can get everything else by symmetry.

Upvotes: 5

Ethan
Ethan

Reputation: 546

Aside from all of the wise comments regarding premature optimization, let's just assume this is the hotspot and do a frigg'n benchmark:

bar char

Times are in nanoseconds, scaled to normalize 'acos' between the systems.
'acos' simply assumes unit radius i.e. acos(adj), whereas 'acos+div' means acos(adj/hyp).

System 1 is a 2.4GHz i5 running Mac OS X 10.6.4 (gcc 4.2.1)
System 2 is a 2.83GHz Core2 Quad running Red Hat 7 Linux 2.6.28 (gcc 4.1.2)
System 3 is a 1.66GHz Atom N280 running Ubuntu 10.04 2.6.32 (gcc 4.4.3)
System 4 is a 2.40GHz Pentium 4 running Ubuntu 10.04 2.6.32 (gcc 4.4.3)

Summary: Relative performance is all over the map. Sometimes atan2 is faster, sometimes its slower. Very strangely, on some systems doing acos with a division is faster than doing it without. Test on your own system :-/

Upvotes: 10

kquinn
kquinn

Reputation: 10740

While others are very right to mention that you are almost certainly falling into the pit of premature optimization, when they say that trigonometric functions are O(1) they're not telling the whole story.

Most trigonometric function implementations are actually O(N) in the value of the input function. This is because the trig functions are most efficiently calculated on a small interval like [0, 2π) (or, for the best implementations, even smaller parts of this interval, but that one suffices to explain things). So the algorithm looks something like this, in pseudo-Python:

def Cosine_0to2Pi(x):
    #a series approximation of some kind, or CORDIC, or perhaps a table
    #this function requires 0 <= x < 2Pi

def MyCosine(x):
    if x < 0:
         x = -x
    while x >= TwoPi:
         x -= TwoPi
    return Cosine_0to2Pi(x)

Even microcoded CPU instructions like the x87's FSINCOS end up doing something like this internally. So trig functions, because they are periodic, usually take O(N) time to do the argument reduction. There are two caveats, however:

  1. If you have to calculate a ton of values off the principal domain of the trig functions, your math is probably not very well thought out.
  2. Big-O notation hides a constant factor. Argument reduction has a very small constant factor, because it's simple to do. Thus the O(1) part is going to dominate the O(N) part for just about every input.

Upvotes: 0

simon
simon

Reputation: 7022

I suspect there's a risk of premature optimization here. Also, be careful about your geometry. Your opposite/adjacent approach is a property of right angle triangles, is that what you actually have?

I'm assuming your points are planar, and so for the general case you have them implicitly representing two vectors form the origin (call these v1 v2), so your angle is

theta=arccos(dot(v1,v2)/(|v1||v2|)) where |.| is vector length.

Making this faster (assuming the need) will depend on a lot of things. Do you know the vector lengths, or have to compute them? How fast can you do a dot product in your architecture. How fast is acos? At some point tricks like table lookup (probably interpolated) might help but that will cost you accuracy.

It's all trade-offs though, there really isn't a general answer to your question.

[edit: added commentary]

I'd like to re-emphasize that often playing "x is fastest" is a bit of a mugs game with modern cpus and compilers anyway. You won't know until you measure it and grovel the generated code. When you hit the point that you really care about it at this level for a (hopefully small) piece of code, you can find out in detail what your system is doing. But it's painstaking. Maybe a table is good. But maybe you've got fast vector computations and a small cache. etc. etc. etc. It all amounts to "it depends". Sorry 'bout that. On the other hand, if you haven't reached the point that you really care so much about this bit of code... you probably shouldn't be thinking about it at this level at all. Make it right. Make it clean (which means abstraction as well as code). Then worry about the overhead.

Upvotes: 10

David Thornley
David Thornley

Reputation: 57036

  1. If you're interested in big-O notation, all the methods you might use are O(1).

  2. If you're interested in what works fastest, test it. Write a wrapper function, one that calls your preferred method but can be easily changed, and test with that. Make sure that your application spends a noticeable amount of time doing this, so you aren't wasting your own time. Try whatever ways occur to you. Ideally, run it on more than one different CPU.

I've become very leery of predicting what will take more or less time on modern processors. Lookup tables used to be the answer if you needed speed, but you don't know a priori the effects on caching or how long it's going to take to normalize and look up versus how long it's going to take to do a trig function on a particular CPU.

Upvotes: 1

yetanotherdave
yetanotherdave

Reputation: 234

Given that this is for a game, you probably care about speed. A lookup table is definitely the fastest but you trade accuracy for speed with this method. So how accurate must you be to meet requirements? Only you can answer that. Before you trade accuracy, determine first if you have a speed problem. All of the trigonometric functions are calculated using numerical methods (research numerical analysis to learn more). Some trig functions are have more expensive methods than others because they rely on series that converge more slowly and who knows, your computer may have different implementations for these functions than another computer. At any rate, you can find out for yourself how expensive these functions are by writing some small programs that loop through as many iterations as you desire, with increments of your choosing, all the while timing the outcomes. Then you can pick the fastest method.

Upvotes: 0

shodanex
shodanex

Reputation: 15406

Get it right first ! And then profile and optimize. Table lookup is a good candidate for sure, but be sure to have your calculation right before doing anything fancy

Upvotes: 1

Chris Ballance
Chris Ballance

Reputation: 34337

If you're going to be doing this many times, pre-calculate in a table. Performance will be much better this way.

Upvotes: 5

Harper Shelby
Harper Shelby

Reputation: 16585

From a pure speed standpoint, a precalculated table and a closest-match lookup would be best. It involves some overhead, of course, depending on how fine-grained you need the angle to be, but it's more than worth it if you're doing this calculation a lot (or in a tight loop), as those are going to be expensive calculations.

Upvotes: 1

Related Questions