Reputation: 230
Can someone please help me understand why accessing an array with a linearly incrementing index is approximately 3-4 times faster than accessing it with a random index?
Is there any way of making the random index access faster?
Please consider the following test code; it returns ~3 seconds for linear access and ~9-10 seconds for random access:
public static void test()
{
    var arr = new byte[64 * 1024 * 1024];
    byte b = 0;
    var sw = new Stopwatch();
    double timeSum = 0;

    for (var i = 0; i < arr.Length; i++)
    {
        sw.Restart();
        b = arr[i];
        sw.Stop();
        timeSum += sw.Elapsed.TotalMilliseconds;
    }
    Console.WriteLine("Linear access : " + timeSum + " ms");

    timeSum = 0;
    var rng = new Random();
    var rnum = 0;
    for (var i = 0; i < arr.Length; i++)
    {
        rnum = rng.Next(0, arr.Length - 1);
        sw.Restart();
        b = arr[rnum];
        sw.Stop();
        timeSum += sw.Elapsed.TotalMilliseconds;
    }
    Console.WriteLine("Random access : " + timeSum + " ms");
}
Upvotes: 4
Views: 1158
Reputation: 32740
The differences you are seeing in your benchmark (4 to 5 times) can't be explained by cache lines and sequential access alone. It's true that sequential, predictable access will be faster, but unless you are managing huge arrays, I'd be surprised if the performance gains were even close to those figures.
EDIT: With arrays of the size used in your benchmark (64 * 1024 * 1024 bytes), the difference is staggering, way more than I expected, to be honest. So my first impression was dead wrong!
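The cache-line effect itself is easy to see in isolation: reading one byte per cache line costs roughly as much as reading all of them, because memory is fetched a whole line at a time. Here is a minimal sketch of that comparison (assuming a 64-byte line size, which is typical on x86; the actual ratio will vary by machine):

```csharp
using System;
using System.Diagnostics;

class StrideDemo
{
    static void Main()
    {
        var arr = new byte[64 * 1024 * 1024];
        long sum = 0;

        // Stride 1: each cache line fetched from memory serves 64 consecutive reads.
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < arr.Length; i++) sum += arr[i];
        sw.Stop();
        Console.WriteLine($"Stride 1:  {sw.ElapsedMilliseconds} ms ({arr.Length} reads)");

        // Stride 64: touches the same number of cache lines, but each fetched
        // line serves only a single read, so the per-read cost is much higher.
        sw.Restart();
        for (int i = 0; i < arr.Length; i += 64) sum += arr[i];
        sw.Stop();
        Console.WriteLine($"Stride 64: {sw.ElapsedMilliseconds} ms ({arr.Length / 64} reads)");

        // Use the result so the JIT cannot eliminate the loops.
        Console.WriteLine(sum); // prints 0 for a zeroed array
    }
}
```

The stride-64 loop does 1/64th of the reads but typically takes a comparable amount of wall time, which is the cache-line story in a nutshell.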
The problem is your benchmark. You are micro-measuring; there is no way you can measure individual lookups with any sort of confidence using System.Diagnostics.Stopwatch.
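To get a sense of the scales involved: a cached array read takes on the order of a nanosecond, while Stopwatch can only resolve whole timer ticks, and Restart/Stop themselves cost many ticks. A quick sketch that prints the timer's actual resolution on your machine:

```csharp
using System;
using System.Diagnostics;

class StopwatchResolution
{
    static void Main()
    {
        // Stopwatch.Frequency is the number of timer ticks per second,
        // so one tick lasts 1e9 / Frequency nanoseconds.
        Console.WriteLine($"High resolution timer: {Stopwatch.IsHighResolution}");
        Console.WriteLine($"Ticks per second:      {Stopwatch.Frequency}");
        Console.WriteLine($"Nanoseconds per tick:  {1_000_000_000.0 / Stopwatch.Frequency:F1}");
    }
}
```

If one tick is, say, 100 ns, then a sub-nanosecond lookup is far below the measurable threshold, and the loop is mostly timing the Stopwatch calls themselves.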
Trying to come up with a fair benchmark is surprisingly tricky, because there is no easy way to isolate the random-number generation from the lookup. I haven't given it much thought, but the following at least tries to compare apples with apples: the trick is to pregenerate random and sequential index arrays and then benchmark the double lookups:
static void Main(string[] args)
{
    lookUpArray(1, new[] { 0 }, new[] { 0 }); // warm up the JIT

    var r = new Random();
    const int arraySize = 10000;
    const int repetitions = 10000;

    var lookupArray = new int[arraySize]; // values don't matter
    var sequentialArray = Enumerable.Range(0, arraySize).ToArray();
    var randomArray = sequentialArray.Select(i => r.Next(0, arraySize)).ToArray();

    for (var i = 0; i < 10; i++)
    {
        var sw = Stopwatch.StartNew();
        lookUpArray(repetitions, lookupArray, randomArray);
        sw.Stop();
        Console.WriteLine($"Random: {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        lookUpArray(repetitions, lookupArray, sequentialArray);
        sw.Stop();
        Console.WriteLine($"Sequential: {sw.ElapsedMilliseconds} ms");
    }
}

private static void lookUpArray(int repetitions, int[] lookupArray, int[] indexArray)
{
    for (var r = 0; r < repetitions; r++)
    {
        for (var i = 0; i < indexArray.Length; i++)
        {
            var _ = lookupArray[indexArray[i]];
        }
    }
}
I am no benchmarking expert by any means, so this is probably flawed in many ways, but I think it's a fairer comparison.
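For anything beyond a quick check, a benchmarking harness handles the warmup, iteration counts, and statistics for you. A sketch using BenchmarkDotNet (this assumes the BenchmarkDotNet NuGet package is installed; the class and method names are mine, not from the original code):

```csharp
using System;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class ArrayAccessBenchmark
{
    private byte[] data;
    private int[] sequentialIndices;
    private int[] randomIndices;

    [GlobalSetup]
    public void Setup()
    {
        data = new byte[64 * 1024 * 1024];
        sequentialIndices = Enumerable.Range(0, data.Length).ToArray();
        var rng = new Random(42); // fixed seed so runs are repeatable
        randomIndices = sequentialIndices.Select(_ => rng.Next(0, data.Length)).ToArray();
    }

    [Benchmark(Baseline = true)]
    public int Sequential()
    {
        var sum = 0;
        foreach (var i in sequentialIndices) sum += data[i];
        return sum; // returning the result prevents dead-code elimination
    }

    [Benchmark]
    public int RandomAccess()
    {
        var sum = 0;
        foreach (var i in randomIndices) sum += data[i];
        return sum;
    }
}

public class Program
{
    public static void Main() => BenchmarkRunner.Run<ArrayAccessBenchmark>();
}
```

BenchmarkDotNet reports the random case relative to the sequential baseline, which is exactly the ratio the question is about, without any of the Stopwatch pitfalls above.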
Upvotes: 3