Daniel Beck

Reputation: 6513

Performing unbiased program/script performance comparison

I want to compare multiple implementations of essentially the same algorithm, written in Java, C++ and Python, the latter executed using PyPy, Jython and CPython, on a MacBook Pro running Mac OS X 10.6.4 with a conventional (non-SSD) HDD.

It's a "decode a stream of data from a file" type of algorithm, where the relevant measurement is total execution time, and I want to prevent bias from, e.g., OS and HDD caches, other programs running simultaneously, a sample file that is too large or too small, etc. What do I need to pay attention to in order to create a fair comparison?

Upvotes: 0

Views: 212

Answers (4)

BOMEz

Reputation: 1010

To prevent bias I would recommend first stopping all unnecessary processes from running in the background.

I'm not sure about Windows, but under Linux you can clear the HDD cache via drop_caches. Information on how to use it is here.

Additionally, you may want to take an average over several runs of the application, so that one-off HDD or OS interference has less effect on the results.
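As a rough sketch of the "average over several runs" idea, something like the following could be used as a timing harness. The helper name `time_command` and the example command are made up for illustration; the command is only a stand-in for however each real decoder is launched.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run cmd `runs` times; return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the program fails, so failed runs are never timed silently.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


# Hypothetical stand-in for a real invocation such as ["java", "Decoder", "sample.bin"].
example_cmd = [sys.executable, "-c", "sum(range(100000))"]

durations = time_command(example_cmd, runs=5)
print("mean  %.4f s" % statistics.mean(durations))
print("stdev %.4f s" % statistics.stdev(durations))
```

Reporting the standard deviation next to the mean gives a hint of how much interference there was: a large spread suggests the runs were disturbed and more of them are needed.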

Upvotes: 0

Anurag Uniyal

Reputation: 88737

Being totally unbiased is impossible. You can do various things, such as running with a minimum of other processes, but IMO the best way is to run the scripts in random order over a long period of time, across different days, and take the average, which will be as near to unbiased as possible.

Ultimately the code will run in such an environment, in random order, and you are interested in average behavior, not a few isolated numbers.
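One possible way to get the random ordering is to build a shuffled schedule up front, so that every implementation gets the same number of runs but no implementation always runs first or last. This is only a sketch; the function name `interleaved_schedule` and the implementation labels are illustrative, not taken from any tool.

```python
import random


def interleaved_schedule(names, runs_each, seed=None):
    """Return a shuffled list in which every name appears exactly runs_each times."""
    rng = random.Random(seed)  # a fixed seed makes the schedule reproducible
    schedule = [name for name in names for _ in range(runs_each)]
    rng.shuffle(schedule)
    return schedule


# Labels for the five setups mentioned in the question (hypothetical identifiers).
implementations = ["java", "cpp", "pypy", "jython", "cpython"]

schedule = interleaved_schedule(implementations, runs_each=3, seed=42)
for position, name in enumerate(schedule, start=1):
    print(position, name)
```

Each entry in the schedule would then be mapped to the matching command and timed. Generating a fresh schedule on each day spreads the runs over time as described above.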

Upvotes: 0

Evan Teran

Reputation: 90422

I would recommend that you simply run each program many times (20 or so) and take the lowest measurement of each set. This makes it highly likely that each program benefits from the HD cache and similar effects. If they all do, the comparison isn't biased.
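A minimal sketch of the "lowest of N runs" approach might look like this. The untimed warm-up run is my own addition, on the assumption that it helps the input file get into the cache before measuring starts; `best_of` and the example command are hypothetical names.

```python
import subprocess
import sys
import time


def best_of(cmd, runs=20, warmups=1):
    """Return the lowest wall-clock time in seconds over `runs` timed executions of cmd."""
    # Untimed warm-up runs, so that later runs are more likely to hit a warm cache.
    for _ in range(warmups):
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)

    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical stand-in for a real invocation such as ["./decoder", "sample.bin"].
example_cmd = [sys.executable, "-c", "sum(range(100000))"]

best = best_of(example_cmd, runs=20)
print("best of 20: %.4f s" % best)
```

The minimum approximates the run with the least outside interference, which is why it tends to be more stable than the mean when other processes are active.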

Upvotes: 0

Jay

Reputation: 14461

Comparisons like this are difficult to do well.

In many cases the operating system will cache files, so the second time a program is executed it suddenly performs much better.

The other problem is that you're comparing interpreted languages against compiled ones. An interpreted language requires an interpreter to be loaded into memory before the program can run. To be scrupulously fair, you should consider whether the interpreter's memory usage and load time should be part of the test. If you're looking for performance in an environment where you can assume the interpreter is always preloaded, then you can ignore that. Many web server setups are able to keep an interpreter preloaded. If you're writing ad hoc client applications for a desktop, then startup can be very slow while the interpreter is loaded.
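To get a feel for how much of the total time is interpreter load time, one rough approach is to time a do-nothing program on the same runtime and compare it with the real workload. The sketch below does this for whichever Python runs it; `min_wall_time` and both commands are illustrative assumptions, and for Java or Jython the no-op would have to be an empty program on that runtime.

```python
import subprocess
import sys
import time


def min_wall_time(cmd, runs=5):
    """Return the lowest wall-clock time in seconds over `runs` executions of cmd."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical commands: an empty program versus a stand-in for the real decoder.
noop_cmd = [sys.executable, "-c", "pass"]
work_cmd = [sys.executable, "-c", "sum(range(2000000))"]

startup = min_wall_time(noop_cmd)
total = min_wall_time(work_cmd)

print("startup only:   %.4f s" % startup)
print("total:          %.4f s" % total)
print("approx. work:   %.4f s" % (total - startup))
```

Whether to report the total or the difference depends on the deployment being modelled: the total for ad hoc desktop use, the difference for a setup where the interpreter is already loaded.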

Upvotes: 1
