JeliBeanMachine
JeliBeanMachine

Reputation: 17

Why would 2D array access be faster than 1D array access?

I have two programs, compiled with g++ and executed on linux. Program 1 creates a 2D array and then measures how long it takes to access all of its elements 100000 times:

#include <time.h>
#include <iostream>

int main()
{
  clock_t time;
  int i, y, x;

  int matrix[9][9]{{ 0,  1,  2,  3,  4,  5,  6,  7,  8},
                   { 9, 10, 11, 12, 13, 14, 15, 16, 17},
                   {18, 19, 20, 21, 22, 23, 24, 25, 26},
                   {27, 28, 29, 30, 31, 32, 33, 34, 35},
                   {36, 37, 38, 39, 40, 41, 42, 43, 44},
                   {45, 46, 47, 48, 49, 50, 51, 52, 53},
                   {54, 55, 56, 57, 58, 59, 60, 61, 62},
                   {63, 64, 65, 66, 67, 68, 69, 70, 71},
                   {72, 73, 74, 75, 76, 77, 78, 79, 80}};

  time = clock();

  for (i = 0; i < 100000; i++)
  {
    for (x = 0; x < 9; x++)
    {
      for (y = 0; y < 9; y++)
      {
        matrix[x][y];
      }
    }
  }

  time = clock() - time;
  std::cout << "Clicks:     " << time << std::endl;
  std::cout << "Time taken: " << (double) time / CLOCKS_PER_SEC << "s" << std::endl;
}

Program 2 creates a 1D array and also measures how long it takes to access all of its elements 100000 times:

#include <time.h>
#include <iostream>

int main()
{
  clock_t time;
  int i, j;

  int vector[81] = { 0,  1,  2,  3,  4,  5,  6,  7,  8,
                     9, 10, 11, 12, 13, 14, 15, 16, 17,
                    18, 19, 20, 21, 22, 23, 24, 25, 26,
                    27, 28, 29, 30, 31, 32, 33, 34, 35,
                    36, 37, 38, 39, 40, 41, 42, 43, 44,
                    45, 46, 47, 48, 49, 50, 51, 52, 53,
                    54, 55, 56, 57, 58, 59, 60, 61, 62,
                    63, 64, 65, 66, 67, 68, 69, 70, 71,
                    72, 73, 74, 75, 76, 77, 78, 79, 80};

  time = clock();

  for (i = 0; i < 100000; i++)
  {
    for (j = 0; j < 81; j++)
    {
      vector[j];
    }
  }

  time = clock() - time;
  std::cout << "Clicks:     " << time << std::endl;
  std::cout << "Time taken: " << (double) time / CLOCKS_PER_SEC << "s" << std::endl;
}

After executing program 1 my output is:

Clicks:     8106
Time taken: 0.008106s

After executing program 2 my output is:

Clicks:     15958
Time taken: 0.015958s

It is my understanding that a 1D array is stored in a continuous block of memory. Likewise the rows of a static 2D array are stored in contiguous blocks of memory. Conversely the rows of a dynamic 2d array might not be stored in contiguous blocks of memory. If this is true then program 2 should be at least similar in speed to program 1 therefore my question is why would program 1 be remarkably faster than program 2?

Upvotes: 0

Views: 1419

Answers (2)

felix
felix

Reputation: 2222

The loops are likely to be removed (kind of optimising) by compiler, because

  1. You actually did nothing in loops.

  2. The matrix can be treated as a const array.

  3. program 1 is faster than program 2. ( :< )

To see whether the deletion happens to your code during compiling, you can increase the most outer loop by 100 times, and see whether the time needed for execution is increased significantly (not necessarily by exact 100 times).

If true, you can prevent this kind of optimising by doing some actual works in loop (calculate the sum, and don't forget printing it afterwards) and introduce some "unpredictable" changes to your matrix, for example:

srand(10);
for (int i=0; i<9; ++i) {
  matrix[i][i] = rand()%100;
}

And further, compiler may conduct some other optimising to your code, for example, expand your loops, even the address of the element you are visiting (they are no longer calculated at run time), you can prevent this by making the executing times of loops "unpredictable" :

#include <chrono>
#include <iostream>
#include <cstdlib>

int array[100];
int array2[10][10];

int64_t Sum1D(int len) {
  int64_t sum = 0;
  for (int i=0; i<100000; ++i) {
    for (int j=0; j<len; ++j) {
        sum += array[j];
    }
  }
  return sum;
}

int64_t Sum2D(int len1, int len2) {
  int64_t sum = 0;
  for (int i=0; i<100000; ++i) {
    for (int j=0; j<len1; ++j) {
      for (int k=0; k<len2; ++k)
        sum += array2[j][k];
    }
  }
  return sum;
}

int main()
{
  for (int i=0; i<100; ++i) {
    array[i] = rand();
    array2[i%10][i/10] = rand();
  }

  auto time = std::chrono::steady_clock::now();

  //int64_t sum = Sum1D(100);
  int64_t sum = Sum2D(10,10);

  auto duration = std::chrono::steady_clock::now()-time;
  std::cout << sum << "!" << duration.count() << std::endl;

  return 0;
} 

which finally makes program1 slower than program2. ( :> )

Upvotes: 1

Enigma
Enigma

Reputation: 329

Here is what I found:

  1. If you actually make use of the value then the run time is almost the same, for example, change matrix[x][y]; to matrix[x][y] += 1; and vector[j]; to vector[j] += 1;

    >     Clicks:     28519
    >     Time taken: 0.028519s
    

    and

    >     Clicks:     29941
    >     Time taken: 0.029941s
    
  2. Without the above changes, optimize while compiling, g++ -O3 <filename>.cpp, this results in same time, got the same following output for both programs:

    $./a.out

    >     Clicks:     2
    >     Time taken: 2e-06s
    

So, what you are pointing out is compiler optimizations.

Upvotes: 2

Related Questions