R.M.

Reputation: 3643

How to get an accurate performance measure?

In our project we're trying to automatically monitor the performance of test runs, to make sure that we don't have any significant changes in the performance of the program over time.

The problem is that there seems to be a consistent 5% variability in the measurements we get. That is, on the same machine, with the same program (no recompilation), running the same test, we get values that differ by around 5% from run to run. This is far too much for what we want to use the numbers for.

We're already excluding setup costs from the timing - that is, from within the C++ code itself we grab the time immediately before and after running the time-critical portions, rather than timing the whole program at the OS level. We are also averaging and excluding outliers (sketched below). The problem is that the variability also appears to have long-term trends: replicates run right after each other cluster tightly, but an hour or two later the times are substantially different. (Unfortunately, spreading the test out over several hours is not feasible.) The tests are also run on a dedicated machine while "nothing else" is running on it.
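
Roughly, the summary step looks like this (the 10% trim fraction and the name robustEstimate are illustrative, not our actual harness):

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // Summarize repeated timings robustly: sort, drop the extremes,
    // and take the median of what remains. 'samples' holds one time
    // per run, in seconds, and is assumed to be non-empty.
    double robustEstimate(std::vector<double> samples) {
        std::sort(samples.begin(), samples.end());
        std::size_t trim = samples.size() / 10;   // drop top/bottom 10%
        std::vector<double> kept(samples.begin() + trim,
                                 samples.end() - trim);
        return kept[kept.size() / 2];             // median of the rest
    }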

We're not quite sure where the timing variation is coming from, but it may have to do with the processor and the system - there are indications that the size of the variability depends on which machine the program is running on.

Does anyone have an idea where this variation is likely to be coming from, and how to remove it? The tests are running on a dedicated machine, so changing the operating system settings would be possible.

(As indicated by the tags, this is a C++ program running on a x86 Linux system, if that helps clarify things.)

Edit: Response to comments

Our current timing scheme is to use the clock() function from the C standard library, looking at the difference in the return value from before/after the functions we want to test.
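
In outline, the scheme looks like the sketch below (timeCriticalPart is a placeholder for the real code under test):

    #include <cstdio>
    #include <ctime>

    // Placeholder for the real time-critical code.
    void timeCriticalPart() {
        volatile double x = 0;
        for (int i = 0; i < 10000000; ++i)
            x = x + i;
    }

    int main() {
        // Note: on Linux, clock() reports CPU time consumed by the
        // process, not elapsed wall-clock time.
        std::clock_t before = std::clock();
        timeCriticalPart();
        std::clock_t after = std::clock();
        std::printf("elapsed: %.6f s\n",
                    double(after - before) / CLOCKS_PER_SEC);
        return 0;
    }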

The code we're testing should be deterministic, and shouldn't involve heavy IO.

I realize that the situation is a little hazy for a "silver bullet" answer. I guess I'm more looking for a "these are the factors that are important to consider, this is the order you probably should check them in, and here's how you go about checking each of them" type answer.

Upvotes: 1

Views: 614

Answers (1)

Thomas Matthews

Reputation: 57749

I'm amazed you got down to 5% variation.

Unless you can get rid of everything unnecessary running on your system, you will see high variation. That is the top-level issue.
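
One mitigation you can apply from inside the program itself is to pin the test to a single core, so the scheduler cannot migrate it between cores mid-run. A sketch using the Linux-specific sched_setaffinity (the core number is arbitrary; glibc exposes the CPU_* macros when building with g++):

    #include <sched.h>
    #include <cstdio>

    int main() {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(2, &set);  // pin to core 2; pick any core you like
        // pid 0 means "the calling process"
        if (sched_setaffinity(0, sizeof(set), &set) != 0) {
            std::perror("sched_setaffinity");
            return 1;
        }
        // ... run the timed tests here ...
        return 0;
    }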

Your OS needs to be deterministic. You need to know what other tasks and threads are running and how long they run. For example, there is the clock interrupt: how many other functions are chained to this interrupt, and do their durations vary?
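
One way to check this on Linux is to snapshot /proc/interrupts before and after a run and diff the two; a rough sketch (runTest is a hypothetical entry point for your test):

    #include <fstream>
    #include <iostream>
    #include <string>

    // Print /proc/interrupts so that two snapshots, taken before and
    // after a test run, can be diffed to see what fired in between.
    void dumpInterrupts(const char* label) {
        std::ifstream in("/proc/interrupts");
        std::string line;
        std::cout << "=== " << label << " ===\n";
        while (std::getline(in, line))
            std::cout << line << '\n';
    }

    // Usage around a test:
    //   dumpInterrupts("before");
    //   runTest();
    //   dumpInterrupts("after");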

Is your system isolated? For example, your measurements may vary if your system is connected to a network.

Does your program use external resources, such as a hard drive? If the program writes to the drive, the timing will not be deterministic: files and parts of files move around on the disk, and the drive may become fragmented. That fragmentation can add variance to your measurements.

The operating system's memory may become fragmented, and so may your executable's own memory. Either kind of fragmentation can add to the variance.
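
For the paging side of this, one mitigation worth trying is to lock the process's pages into RAM with mlockall, which removes page faults and swapping as sources of run-to-run variance; a sketch, assuming a Linux target and sufficient privileges (CAP_IPC_LOCK or a generous RLIMIT_MEMLOCK):

    #include <sys/mman.h>
    #include <cstdio>

    int main() {
        // Lock all current and future pages into physical memory so
        // page faults and swapping cannot perturb the timed section.
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
            std::perror("mlockall");
            return 1;
        }
        // ... run the timed tests here ...
        return 0;
    }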

Upvotes: 2
