How Physically are Arrays Stored (Specifically with dimensions greater than 2)?

Question

What I Know

I know that arrays int ary[] can be expressed in the equivalent "pointer-to" format: int* ary. However, what I would like to know is that if these two are the same, how physically are arrays stored?

I used to think that the elements are stored next to each other in the ram like so for the array ary:

int size = 5;
int* ary = new int[size];
for (int i = 0; i < size; i++) { ary[i] = i; }

This (I believe) is stored in RAM like: ...[0][1][2][3][4]...

This means we can subsequently replace ary[i] with *(ary + i) by just increment the pointers' location by the index.

The Issue

The issue comes in when I am to define a 2D array in the same way:

int width = 2, height = 2;
Vector** array2D = new Vector*[height]
for (int i = 0; i < width; i++) {
    array2D[i] = new Vector[height];
    for (int j = 0; j < height; j++) { array2D[i][j] = (i, j); }
}

Given the class Vector is for me to store both x, and y in a single fundamental unit: (x, y).

So how exactly would the above be stored?

It cannot logically be stored like ...[(0, 0)][(1, 0)][(0, 1)][(1, 1)]... as this would mean that the (1, 0)th element is the same as the (0, 1)th.
It cannot also be stored in a 2d array like below, as the physical RAM is a single 1d array of 8 bit numbers:
- ...[(0, 0)][(1, 0)]...
- ...[(0, 1)][(1, 1)]...
Neither can it be stored like ...[&(0, 0)][&(1, 0)][&(0, 1)][&(1, 1)]..., given &(x, y) is a pointer to the location of (x, y). This would just mean each memory location would just point to another one, and the value could not be stored anywhere.

Thank you in advanced.

user4581301 · Accepted Answer

What OP is struggling with a dynamically allocated array of pointers to dynamically allocated arrays. Each of these allocations is its own block of memory sitting somewhere in storage. There is no connection between them other than the logical connection established by the pointers in the outer array.

To try to visualize this say we make

int ** twodee;
twodee = new int*[4];
for (int i = 0; i < 4; i++)
{
    twodee[i] = new int[4];
}

and then

int count = 1;
for (int i = 0; i < 4; i++)
{
    for (int j = 0; j < 4; j++)
    {
        twodee[i][j] = count++;
    }
}

so we should wind up with twodee looking something like

 1  2  3  4
 5  6  7  8
 9 10 11 12
13 14 15 16

right?

Logically, yes. But laid out in memory twodee might look something like this batsmurph crazy mess:

You can't really predict where your memory will be, you're at the mercy of the whatever memory manager handles the allocations and what already in storage where it might have been efficient for your memory to go. This makes laying dynamically-allocated multi-dimensional arrays out in your head almost a waste of time.

And there are a whole lot of things wrong with this when you get down into the guts of what a modern CPU can do for you. The CPU has to hop around a lot, and when it's hopping, it's ability to predict and preload the cache with memory you're likely to need in the near future is compromised. This means your gigahertz computer has to sit around and wait on your megahertz RAM a lot more than it should have to.

Try to avoid this whenever possible by allocating single, contiguous blocks of memory. You may pick up a bit of extra code mapping one dimensional memory over to other dimensions, but you don't lose any CPU time. C++ will have generated all of that mapping math for you as soon as you compiled [i][j] anyway.

How Physically are Arrays Stored (Specifically with dimensions greater than 2)?

What I Know

The Issue

Answers (2)

Related Questions