funny man
funny man

Reputation: 105

How to speed up memory allocation for 2D triangular matrix in c++?

I need to allocate memory for a very large array which represents triangular matrix. I wrote the following code:

const int max_number_of_particles=20000;
float **dis_vec;

dis_vec = new float **[max_number_of_particles];

for (i = 0; i<max_number_of_particles; i++)
  dis_vec[i] = new float *[i];

for (i = 0; i<max_number_of_particles; i++)
  for (j = 0; j<i; j++)
    dis_vec[i][j] = new float[2];

The problem is that the time needed to do it (to allocate the memory) quickly increases with the increasing size of matrix. Does anyone know better solution for this problem?

Thanks.

Upvotes: 2

Views: 1138

Answers (2)

user213265
user213265

Reputation:

Yes.

First... start with your inner loop.

"new float[2]"

That allocates an array, which I imagine is slower to allocate than a fixed size object that happens to have 2 floats.

struct Float2D { float a; float b; };

x = new Float2D;

that seems better.

But really, forget all that. If you want it fast... just malloc a bunch of floats.

I'd say... let some floats go to waste. Just alloc a plain old 2D array.

float* f = (float*)malloc( max_number_of_particles*max_number_of_particles*2*sizeof(float) );

The only size saving you could get over this, is a 2x size saving by using a triangle instead of a square.

However, I'm pretty damn sure you KILLED that entire "size saving" already, by using "new float[2]", and "new float *[i];". I'm not sure how much the overhead of "new" is, but I imagine it's like malloc except worse. And I think most mallocs have about 8 bytes overhead per allocation.

So what you have already is WORSE than a 2X size lost by allocating a square.

Also, it makes the math simpler. You'd need to do some wierd "Triangular number" math to get the pointer. Something like (n+1)*n/2 or whatever it is :)

Upvotes: 0

lijie
lijie

Reputation: 4871

Allocate a one dimensional array and convert indices to subscripts and vice versa. One allocation compared to O(N) allocations should be much faster.

EDIT

Specifically, just allocate N(N+1)/2 elements, and when you want to access [r][c] in the original, just access [r*(r+1)/2 + c] instead.

Upvotes: 5

Related Questions