Reputation: 10324

compiler optimisation

I just learned that the compiler can inline code, replacing calls to functions and uses of variables with the code they represent. With this in mind, would the second method below be better (because it is clearer) and still run just as fast as the first one?

//check to see if sphere intersects the box
bool BoundingBox::Intersects(BoundingSphere boundingSphere)
{
// check for intersection on each axis by seeing if the radius is large enough to reach the edge of the cube on the 
// appropriate side. All must evaluate to true for there to be an intersection.
return (
    ((boundingSphere.Centre().x < negCorner.x && boundingSphere.Radius() > posCorner.x - boundingSphere.Centre().x) ||
     (boundingSphere.Centre().x > posCorner.x && boundingSphere.Radius() > boundingSphere.Centre().x - negCorner.x)) 
     &&
    ((boundingSphere.Centre().y < negCorner.y && boundingSphere.Radius() > posCorner.y - boundingSphere.Centre().y) ||
     (boundingSphere.Centre().y > posCorner.y && boundingSphere.Radius() > boundingSphere.Centre().y - negCorner.y)) 
     &&
    ((boundingSphere.Centre().z < negCorner.z && boundingSphere.Radius() > posCorner.z - boundingSphere.Centre().z) ||
     (boundingSphere.Centre().z > posCorner.z && boundingSphere.Radius() > boundingSphere.Centre().z - negCorner.z)));
}

Second Method:

//check to see if sphere intersects the box
bool BoundingBox::Intersects(BoundingSphere boundingSphere)
{
bool xIntersects, yIntersects, zIntersects;

xIntersects = 
    ((boundingSphere.Centre().x < negCorner.x && boundingSphere.Radius() > posCorner.x - boundingSphere.Centre().x) ||
     (boundingSphere.Centre().x > posCorner.x && boundingSphere.Radius() > boundingSphere.Centre().x - negCorner.x));

yIntersects = 
    ((boundingSphere.Centre().y < negCorner.y && boundingSphere.Radius() > posCorner.y - boundingSphere.Centre().y) ||
     (boundingSphere.Centre().y > posCorner.y && boundingSphere.Radius() > boundingSphere.Centre().y - negCorner.y));

zIntersects = 
    ((boundingSphere.Centre().z < negCorner.z && boundingSphere.Radius() > posCorner.z - boundingSphere.Centre().z) ||
     (boundingSphere.Centre().z > posCorner.z && boundingSphere.Radius() > boundingSphere.Centre().z - negCorner.z));

return (xIntersects && yIntersects && zIntersects);
}

Upvotes: 2

Answers (3)

Joel Rondeau

Reputation: 7586

I would argue for a combination of the two. You want to check xIntersects && yIntersects && zIntersects, so make each of them its own function. Like so:

bool BoundingBox::Intersects(BoundingSphere boundingSphere)
{
  return XIntersects(boundingSphere) && YIntersects(boundingSphere) && ZIntersects(boundingSphere);
}

bool BoundingBox::XIntersects(BoundingSphere boundingSphere)
{
  return ((boundingSphere.Centre().x < negCorner.x && boundingSphere.Radius() > posCorner.x - boundingSphere.Centre().x) ||
          (boundingSphere.Centre().x > posCorner.x && boundingSphere.Radius() > boundingSphere.Centre().x - negCorner.x));
}
bool BoundingBox::YIntersects(BoundingSphere boundingSphere)
{
  return ((boundingSphere.Centre().y < negCorner.y && boundingSphere.Radius() > posCorner.y - boundingSphere.Centre().y) ||
          (boundingSphere.Centre().y > posCorner.y && boundingSphere.Radius() > boundingSphere.Centre().y - negCorner.y));
}
bool BoundingBox::ZIntersects(BoundingSphere boundingSphere)
{
  return ((boundingSphere.Centre().z < negCorner.z && boundingSphere.Radius() > posCorner.z - boundingSphere.Centre().z) ||
          (boundingSphere.Centre().z > posCorner.z && boundingSphere.Radius() > boundingSphere.Centre().z - negCorner.z));
}

You get both the speed of the first and the clarity of the second, and an optimizing compiler might even optimize out the function calls.

Upvotes: 1

ltjax

Reputation: 16007

First of all, this is not an accurate sphere vs. box check. It's essentially a broken box vs. box-around-a-sphere test (it won't report an intersection when the sphere center is contained in the box!). Look for Arvo's algorithm if you want to do this right.
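
For reference, here is a minimal sketch of the idea behind Arvo's algorithm: sum the squared per-axis distance from the sphere centre to the box and compare it with the squared radius. The Vec3, Sphere and Box structs and the AxisDistanceSquared helper are hypothetical stand-ins, since the question's own classes are not shown in full.

```cpp
// Hypothetical minimal types for illustration only; they are not the
// BoundingBox / BoundingSphere classes from the question.
struct Vec3 { float x, y, z; };
struct Sphere { Vec3 centre; float radius; };
struct Box { Vec3 negCorner; Vec3 posCorner; };  // minimum and maximum corners

// Squared distance from coordinate c to the interval [lo, hi] on one axis.
// Returns zero when c lies inside the interval.
inline float AxisDistanceSquared(float c, float lo, float hi)
{
    if (c < lo) { const float d = lo - c; return d * d; }
    if (c > hi) { const float d = c - hi; return d * d; }
    return 0.0f;
}

// The sphere touches or overlaps the box when the squared distance from its
// centre to the closest point of the box is at most radius squared.
inline bool Intersects(const Box& box, const Sphere& s)
{
    const float d2 =
        AxisDistanceSquared(s.centre.x, box.negCorner.x, box.posCorner.x) +
        AxisDistanceSquared(s.centre.y, box.negCorner.y, box.posCorner.y) +
        AxisDistanceSquared(s.centre.z, box.negCorner.z, box.posCorner.z);
    return d2 <= s.radius * s.radius;
}
```

A sphere whose centre lies inside the box gets a squared distance of zero and is reported as intersecting, which is the case the test in the question misses. A sphere near a corner is rejected unless it actually reaches the corner.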

But back to your question: if there's a measurable speed difference, and I doubt there is, it most certainly wouldn't be related so much to inlining as to the slightly different semantics of the two functions. The first function does lazy evaluation at its top level via the && operator. So if you get a negative result on the first axis, it'll bail out and not test the other ones. This will likely give you a speed advantage on a very slow computer when you have enough negative results on that axis or the second.

The second function does not check whether a prior axis gave a negative answer, so it always tests all three. This will probably be faster on a decent computer, since it doesn't have to wait for the results of the first checks to decide whether it may execute the checks that come next. It therefore incurs fewer branch mispredictions in general, and avoiding them is often worth more than the extra work of evaluating both branches.
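
The difference in semantics can be made visible with a small sketch. AxisTest, ShortCircuit, Eager and the g_axisTests counter are hypothetical names; AxisTest stands in for one per-axis check and counts how often it runs.

```cpp
// Counts how many per-axis tests were actually evaluated.
static int g_axisTests = 0;

// Stand-in for one axis check: records the call and returns the given result.
inline bool AxisTest(bool result)
{
    ++g_axisTests;
    return result;
}

// Shape of the first method: && stops at the first axis that fails.
inline bool ShortCircuit(bool x, bool y, bool z)
{
    return AxisTest(x) && AxisTest(y) && AxisTest(z);
}

// Shape of the second method: every axis is evaluated before the final &&.
inline bool Eager(bool x, bool y, bool z)
{
    const bool xIntersects = AxisTest(x);
    const bool yIntersects = AxisTest(y);
    const bool zIntersects = AxisTest(z);
    return xIntersects && yIntersects && zIntersects;
}
```

With a failing first axis, ShortCircuit runs one test while Eager runs all three. The counter is a visible side effect, so the compiler has to keep that difference here; with side-effect-free checks it is free to pick either strategy.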

Then again, the optimizer might be smart enough to figure out that it can execute the expressions on the other side of the && operator without side effects. Inlining (either explicitly or via link-time code generation) actually plays a small role here, as the optimizer needs to look at what the Centre and Radius functions actually do.

But the only sure way to know is to look at the generated code and benchmark it!
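
As a rough starting point, a timing harness along these lines could be used to compare the two versions. TimeNanoseconds is a hypothetical helper, not part of any library; it only assumes that the function under test is callable on each input and returns something convertible to bool.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs fn over every input and returns the elapsed time
// in nanoseconds. Positive results are counted into hits so that the
// optimizer cannot discard the calls as unused.
template <typename Fn, typename Input>
long long TimeNanoseconds(Fn fn, const std::vector<Input>& inputs, std::size_t& hits)
{
    const auto start = std::chrono::steady_clock::now();
    for (const Input& in : inputs)
    {
        if (fn(in))
            ++hits;
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Run it on a large, realistic set of spheres, build with optimizations enabled, and repeat the measurement several times. A single short run mostly measures noise. Checking that both versions produce the same hits value is also a cheap sanity test that they agree.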

Upvotes: 1

Steve Townsend

Reputation: 54178

There is no guarantee of your expected behaviour - the compiler would have to be pretty smart to work out that it did not have to calculate all the conditions for x/y/z before returning a result.

In the first version you KNOW you will return on the first failed test. I'd stick with that, and comment and format the code to make it clearer.
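
One possible way to do that, as a sketch: keep the short-circuiting && chain but move the repeated per-axis condition into a single helper. AxisIntersects and the free-function form of Intersects are hypothetical, as is the Vec3 struct; the condition itself is copied unchanged from the question, so any concern about its correctness still applies.

```cpp
// Hypothetical stand-in for the question's vector type.
struct Vec3 { float x, y, z; };

// The question's per-axis condition, unchanged, written once.
// neg and pos are the box's minimum and maximum on this axis.
inline bool AxisIntersects(float centre, float radius, float neg, float pos)
{
    return (centre < neg && radius > pos - centre) ||
           (centre > pos && radius > centre - neg);
}

inline bool Intersects(const Vec3& negCorner, const Vec3& posCorner,
                       const Vec3& centre, float radius)
{
    // && short-circuits: y is only tested if x passed, z only if y passed.
    return AxisIntersects(centre.x, radius, negCorner.x, posCorner.x)
        && AxisIntersects(centre.y, radius, negCorner.y, posCorner.y)
        && AxisIntersects(centre.z, radius, negCorner.z, posCorner.z);
}
```

A small helper like this is a likely candidate for inlining, so the early exit of the first version is kept while each axis reads as one line.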

Upvotes: 8
