Numpy - creating overlapping 3D subarrays as vectors in a memory-efficient way

Question

I'm trying to create a list of all the overlapping subarrays of equal size from a larger 3D array (for patch-based segmentation). Each subarray needs to be flattened into a 1D vector so that I can use the ball tree in sklearn.neighbors.BallTree.

So for example, given a 100x100x100 image, if I were to break this down into 5x5x5 overlapping patches (subarrays), I would have 96x96x96 = 884,736 of them.
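As a quick check on that count, here is a minimal sketch of the arithmetic. It assumes a stride of 1 along every axis, and the variable names are only for illustration.

```python
import math

image_shape = (100, 100, 100)  # size of the full 3D image
patch_shape = (5, 5, 5)        # size of each overlapping patch

# With a stride of 1, a patch can start at (n - p + 1) positions along each axis.
positions_per_axis = [n - p + 1 for n, p in zip(image_shape, patch_shape)]

n_patches = math.prod(positions_per_axis)  # total number of overlapping patches
patch_len = math.prod(patch_shape)         # length of each flattened patch

print(positions_per_axis, n_patches, patch_len)
```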

However, I have not found a way to do this without numpy allocating new memory for each flattened/vectorized subarray. This seems to be because each subarray is not contiguous in memory.

For example, for the 100x100x100 image, if I want each 5x5x5 patch as a 1D vector (of length 125), numpy allocates a brand new array in memory for all 884,736 of them. That becomes rather large, especially if I want to work with more than a single 100x100x100 image!
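The behaviour can be reproduced on a smaller array. The sketch below assumes NumPy 1.20 or newer (for `numpy.lib.stride_tricks.sliding_window_view`) and a float64 image; the 20x20x20 size is chosen only to keep the example quick.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20

# Small stand-in for the real image, so the example runs quickly.
image = np.arange(20 * 20 * 20, dtype=np.float64).reshape(20, 20, 20)

# 6D strided view, one 5x5x5 window per valid origin. No patch data is copied.
windows = sliding_window_view(image, (5, 5, 5))
print(windows.shape, np.shares_memory(windows, image))

# Flattening each window into a row cannot be expressed with strides alone,
# so reshape has to allocate a new array holding every patch.
flat = windows.reshape(-1, 125)
print(flat.shape, np.shares_memory(flat, image))

# Compare the size of the original image with the flattened patch matrix.
print(image.nbytes, flat.nbytes)
```

Scaled up to the 100x100x100 case with float64 values, the flattened matrix would hold 884,736 x 125 = 110,592,000 values, or roughly 885 MB, compared with 8 MB for the image itself.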

I would welcome any solutions for overcoming this memory challenge in Python/numpy. I was considering creating a subclass of numpy.ndarray that stores a pointer to the location of the patch in the bigger image but returns the data as a 1D numpy array only when called (and this is then deleted again when not in use), but I have not come across enough details on subclassing ndarray objects to do so. I will be really disappointed if the only solution is to implement everything in C/C++ instead. I appreciate any help that can be provided, thanks!
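To make the lazy-access idea concrete, here is a rough sketch that uses a plain wrapper class rather than an ndarray subclass. `LazyPatches` is a hypothetical name, not part of numpy, and the sketch assumes a stride of 1. It only illustrates on-demand access: as far as I know, BallTree still expects the full 2D array of samples up front, so this alone would not solve the problem.

```python
import numpy as np


class LazyPatches:
    """Hypothetical helper: builds each flattened patch on demand."""

    def __init__(self, image, patch_shape):
        self.image = image
        self.patch_shape = tuple(patch_shape)
        # Number of valid patch origins along each axis (stride of 1).
        self.grid = tuple(n - p + 1 for n, p in zip(image.shape, self.patch_shape))

    def __len__(self):
        return int(np.prod(self.grid))

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # Convert the flat patch index into the origin of the patch in the image.
        origin = np.unravel_index(index, self.grid)
        region = tuple(slice(o, o + p) for o, p in zip(origin, self.patch_shape))
        # ravel() copies only this one patch (125 values for a 5x5x5 patch).
        return self.image[region].ravel()


image = np.random.rand(100, 100, 100)
patches = LazyPatches(image, (5, 5, 5))
print(len(patches), patches[0].shape)
```

Only the original image stays resident; each 125-element vector is created when indexed and can be garbage-collected once it is no longer referenced.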
