Jack Straub
Jack Straub

Reputation: 470

Overuse of Method-chaining in Java

I see a lot of this kind of code written by Java developers and Java instructors:

for ( int x = 0 ; x < myArray.length ; x++ )
    accum += (mean() - myArray[x]) * (mean() - myArray[x] );

I am very critical of this because mean() is being invoked twice for every element in the array, when it only has to be invoked once:

double theMean = mean();
for ( int x = 0 ; x < myArray.length ; x++ )
    accum += (theMean - myArray[x]) * (theMean - myArray[x]);

Is there something about optimization in Java that makes the first example acceptable? Should I stop riding developers about this?

*** More information. An array of samples is stored as an instance variable. mean() has to traverse the array and calculate the mean every time it is invoked.

Upvotes: 1

Views: 496

Answers (5)

M A
M A

Reputation: 72854

Normally the compiler will not optimize the method call since it cannot know whether the return value would be the same (this is especially true when mean processes an array as it has no way of checking whether the result can be cached). So yes the mean() method would be invoked twice.

In this case, if you know for sure that the array is kept the same regardless of the values of x and accum in the loop (more generally, regardless of any change in the program values), then the second code is more optimal.

Upvotes: 0

ᴇʟᴇvᴀтᴇ
ᴇʟᴇvᴀтᴇ

Reputation: 12751

Your version will likely run faster, though an optimizing compiler may be able to detect if the mean() method returns the same value every time (e.g. if the value is hard-coded or stored in a field) and eliminate the method call.

If you are recommending this change for efficiency reasons, you may be falling foul of premature optimization. You don't really know where the bottlenecks are in your system until you measure in the appropriate environment under appropriate loads. Even then, improved hardware is often more cost-effective solution than developer time.

If you are recommending it because it will eliminate duplication then I think you might be on stronger ground. If the mean() method took arguments too, it would be especially reasonable to pull that out of the loop and call the method once and only once.

Upvotes: 1

Software Engineer
Software Engineer

Reputation: 16110

Leave your developers alone, it's fine -- it's readable and it works, without introducing unnecessary names and variables.

Optimization should only ever be done under the guidance of a performance monitoring tool which can show you where you're actually slow. And, typically, performance is enhanced more effectively by considering the large scale architecture of an application, not line by line bytecode optimization, which is expensive and usually unhelpful.

Upvotes: 1

arcy
arcy

Reputation: 13123

Yes, some compilers will optimize this to just what you say.

Yes, you should stop riding developers about this.

I think your preferred way is better, but not mostly because of the optimization. It is more clear that the value is the same in both places if it does not involve a method call, particularly in cases where the method call is more complex than the one you have here.

For that matter, I think it's better to write

double theMean = mean();
for (int x=0; x < myArray.length; x++)
{  double curValue = myArray[x];
   double toSquare = theMean - curValue;
   accum += toSquare * toSquare;
}

Because it makes it easier to determine that you are squaring whatever is being accumulated, and just what it is that's being sqaured.

Upvotes: 0

Eran
Eran

Reputation: 393831

You are right. Your way (second code sample) is more efficient. I don't think Java can optimize the first code sample to call mean() just once and re-use its return value, since mean() might have side effects, so the compiler can't decide to call it once if your code calls it twice.

Upvotes: 1

Related Questions