Reputation: 303
Good evening,
I am currently working on a first-year university project to simulate continuum percolation. This involves randomly distributing discs/spheres/hyperspheres across a square/cube/hypercube in n-dimensional space and finding a cluster of connected particles that spans the boundaries.
To speed up what is essentially collision detection between all these particles (in order to group them into connected clusters), I have decided to use spatial partitioning so that my program scales nicely with the number of particles. This requires dividing the n-dimensional space into evenly sized boxes/cubes/hypercubes and placing each particle in the relevant box, so that an optimised collision check can be done with fewer comparisons: only particles lying in the boxes adjacent to the one containing the new particle need to be checked. All the detail has been worked out algorithmically.
However, it seemed like a good idea to use an ndarray whose dimension equals that of the space being studied. Each "point" in the ndarray would itself contain an array of particle objects. It would then be easy to look at the cells of the ndarray surrounding the coordinates of a new particle and cycle through the arrays they contain, which hold the other particles against which the check must be done. I then found out that an ndarray can only contain objects of a fixed size, which these arrays of particles are not, since they grow as particles are randomly added to the system.
Would a normal numpy array of arrays of arrays (etc.) be the only solution, or do structures similar to ndarray but able to accommodate objects of variable size exist? Ndarray seemed great because it is part of numpy, which is written in the compiled language C, so it would be fast. Furthermore, an ndarray would not require any loops to construct, as I believe an array of arrays of arrays (etc.) would (NB: the dimensionality of the space and the increments of the spatial division are not constant, since particles of different radii can be added, which changes the size of the spatial division squares/cubes/hypercubes).
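For completeness, here is a minimal sketch of the kind of cell structure I am describing, using a plain dict keyed by integer cell-coordinate tuples instead of an ndarray (all names here are placeholders I made up; a dict sidesteps the fixed-size issue, but I do not know whether it would be fast enough):

    import itertools

    class Grid:
        """Uniform grid over n-dimensional space; each cell holds a list of particles."""

        def __init__(self, cell_size, dimension):
            self.cell_size = cell_size
            self.dimension = dimension
            self.cells = {}  # cell-coordinate tuple -> list of particles

        def cell_index(self, position):
            """Map an n-dimensional position to the integer coordinates of its cell."""
            return tuple(int(x // self.cell_size) for x in position)

        def insert(self, particle, position):
            self.cells.setdefault(self.cell_index(position), []).append(particle)

        def candidates(self, position):
            """Yield every particle in the cell containing `position` or any adjacent cell."""
            centre = self.cell_index(position)
            for offset in itertools.product((-1, 0, 1), repeat=self.dimension):
                cell = tuple(c + o for c, o in zip(centre, offset))
                yield from self.cells.get(cell, ())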
Speed is very important in this program, and it would be a shame to see the algorithmically sound optimisations I have found ruined by a bad implementation!
Upvotes: 2
Views: 326
Reputation: 179717
Have you considered using a kd-tree instead? kd-trees support fast enumeration of the neighbours of a point by splitting the space (much like you suggested with the multidimensional arrays).
As a nice bonus, there's already a decent kd-tree implementation in SciPy, the companion project to NumPy: scipy.spatial.KDTree.
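For instance, a minimal sketch of how it could be used here (the particle count, dimension, and radius are made-up parameters for illustration):

    import numpy as np
    from scipy.spatial import KDTree

    rng = np.random.default_rng(0)
    n, dim, radius = 1000, 3, 0.05        # made-up parameters
    points = rng.random((n, dim))         # particle centres in the unit hypercube

    tree = KDTree(points)
    # Two equal-radius spheres overlap when their centres are closer than 2 * radius.
    pairs = tree.query_pairs(2 * radius)  # set of (i, j) index pairs with i < j

The resulting pairs could then be fed into a union-find (disjoint-set) structure to group the particles into connected clusters.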
Upvotes: 3