Sam

Reputation: 7398

Why does it take longer to access a previously declared variable than one just declared?

I recently ran a benchmark to see whether access times differ between a variable declared at the start of a block of variable declarations and one declared at the end.

Benchmark code (selected variable declared at end of block):

// Benchmark 1    
for (long i = 0; i < 6000000000; i++)
{
    var a1 = 0;
    var b1 = 0;
    var c1 = 0;

    // 53 variables later...

    var x2 = 0;
    var y2 = 0;
    var z2 = 0;

    z2 = 1;     // Write

    var _ = z2; // Read
}

Benchmark code (selected variable declared at start of block):

// Benchmark 2    
for (long i = 0; i < 6000000000; i++)
{
    var a1 = 0;
    var b1 = 0;
    var c1 = 0;

    // 53 variables later...

    var x2 = 0;
    var y2 = 0;
    var z2 = 0;

    a1 = 1;     // Write

    var _ = a1; // Read
}

To my surprise, the results (averaged over 3 runs, excluding the first build, without optimizations) were as follows:

Benchmark 1: 9,419.7 milliseconds.

Benchmark 2: 12,262 milliseconds.

As you can see, accessing the "newer" variable in the above benchmark is 23.18% (2,842.3 ms) faster, but why?

Upvotes: 0

Views: 104

Answers (2)

sibi

Reputation: 39

Thinking in assembly terms, closer to the hardware, it might be something like this: in the faster version, the address of the most recently accessed variable, z2, is still held in a register, so it can be reused directly for the write and the read without recalculating the memory address.

It could also be an automatic optimization done by the interpreter/compiler. Have you tried variables other than z2 for your write/read test at the end of the loop? What happens if you use x2 or y2, or any of the variables in the middle? Are the access times for all the variables other than z2 equal, or do they differ as well?

Upvotes: 1

usr

Reputation: 171178

Normally, unused locals are deleted by basically any optimizing compiler in the world. You only write to most of these variables and never read them, which makes it an easy case for eliminating their physical storage.

The relation between logical locals and their physical storage is highly complex. They might be deleted, enregistered, or spilled.

So don't assume that var _ = a1; actually results in a read from a1 and a write to _. After optimization it may do nothing at all.

The JIT switches off a few optimizations in methods with many local variables (I believe the threshold is 64) because some of its algorithms have quadratic running time in the number of locals. Maybe that's why those extra locals impact performance.

Try it with fewer variables and you will not be able to distinguish variations of this function from one another.

Or, try it with VC++, GCC or Clang. They should all delete the entire loop. I'd be very disappointed if they didn't.

I don't think you are measuring anything relevant here. Whatever the result of your benchmark, it tells you nothing about real-world code. If this were an interesting case I'd look at the disassembly, but as I said, I think it's irrelevant; whatever I found would not be an interesting find.

If you want to learn what code a compiler typically generates, you should write some simple functions and look at the generated machine code. This can be very instructive.

Upvotes: 5
