Reputation: 830
Scanning a two-dimensional array is obviously faster vertically assuming that x
is first index and y
is second (due to less cache misses).
However, to my surprise, when I need to scan the array by using vertical stripes (with stripe breadth being 3), I get results worse by 5% if I process a single stripe vertically than when I do it horizontally. Could somebody explain me how is that possible?
(edit: actually array access in “vertical” method is indeed faster, but this method uses more loops which slow down the whole thing by more than we gain from less cache misses; see the answer).
Code below is just an example, but when I benchmarked in BenchmarkDotNet the same approach to scanning arrays in my scanline-fill algorithm implementation, I got same difference in performance.
Here is my C# benchmark:
private void ProcessVerticalStripesHorizontally(int[,] matrix)
{
int size = matrix.GetLength(0);
for (int x = 1; x < size - 1; x++)
{
for (int y = 0; y < size; y++)
{
// should be slowe because x is changed often
var value = matrix[x, y];
var valueLeft = matrix[x-1, y];
var valueRight = matrix[x+1, y];
}
}
}
private void ProcessVerticalStripesVertically(int[,] matrix)
{
int size = matrix.GetLength(0);
for (int x = 1; x < size-1; x++)
{
// should be faster because x doesn't change often
for (int y = 0; y < size; y++)
{
var value = matrix[x, y];
}
for (int y = 0; y < size; y++)
{
var valueLeft = matrix[x - 1, y];
}
for (int y = 0; y < size; y++)
{
var valueRight = matrix[x + 1, y];
}
}
}
[Test]
public void AccessToArrayTest()
{
int size = 5000;
var matrix = new int[size, size];
ProcessVerticalStripesHorizontally(matrix);
ProcessVerticalStripesVertically(matrix);
for (int run = 0; run < 5; run++)
{
Console.WriteLine("run " + run + ": ");
var stopwatch = Stopwatch.StartNew();
for (int iteration = 0; iteration < 5; iteration++)
{
ProcessVerticalStripesHorizontally(matrix);
}
Console.WriteLine("processing stripes horizontally: "
+ stopwatch.ElapsedMilliseconds + " ms");
stopwatch.Restart();
for (int iteration = 0; iteration < 5; iteration++)
{
ProcessVerticalStripesVertically(matrix);
}
Console.WriteLine("processing stripes vertically: "
+ stopwatch.ElapsedMilliseconds + " ms");
Console.WriteLine();
}
}
Results:
run 0:
processing stripes horizontally: 454 ms
processing stripes vertically: 468 ms
run 1:
processing stripes horizontally: 441 ms
processing stripes vertically: 456 ms
run 2:
processing stripes horizontally: 437 ms
processing stripes vertically: 453 ms
run 3:
processing stripes horizontally: 437 ms
processing stripes vertically: 456 ms
run 4:
processing stripes horizontally: 441 ms
processing stripes vertically: 449 ms
Upvotes: 2
Views: 342
Reputation: 2851
By analyzing your code I can see that the ProcessVerticalStripesHorizontally method will have a O(n^2) - it will do n*n number of loops.
Meanwhile the ProcessVerticalStripesVertically method will have O(n*3n) - it will do n*n*3 number of loops.
Therefore if your matrix if a 100 x 100 we will have the following:
ProcessVerticalStripesHorizontally 100 * 100 = 10,000 number of loops ProcessVerticalStripesVertically 100 * 100 * 3 = 30,000 number of loops
Upvotes: 2