P-Gn

Reputation: 24661

Non-contiguous owning numpy arrays: do they exist & when to expect them

I was wondering whether there is any situation in which a numpy array that owns its data is stored non-contiguously.

From a numerical point of view, non-contiguous, row- or column-aligned buffers make sense and are ubiquitous in performance libraries such as IPP. However, it seems that numpy by default converts anything passed to np.array into a contiguous buffer. As far as I can tell, the documentation does not state this explicitly.

My question is, does numpy guarantee that any owning array created with np.array is contiguous in memory? More generally, in which situations can we come across a non-contiguous owning array?

EDIT following @Eelco's answer

By non-contiguous, I mean that there are some "empty spaces" in the memory chunk used to store the data (strides[1] > shape[0] * itemsize for a column-major 2-D array, if you will). I do not mean an array whose data is stored using two or more memory allocations; I would be surprised if such an owning numpy array existed. This seems to be consistent with numpy's terminology according to this answer.

By owning arrays, I mean arrays for which .flags.owndata is True. I am not interested in non-owning arrays, which can indeed behave wildly.
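To make the two definitions above concrete, here is a minimal sketch that inspects the flags and strides of a strided view and of the copy np.array makes from it. The variable names are illustrative only, and it shows a single case rather than a general guarantee.

```python
import numpy as np

# An owning, packed 3x4 array of 8-byte integers.
base = np.zeros((3, 4), dtype=np.int64)

# Basic slicing returns a view: taking every other column leaves
# gaps between consecutive elements in memory.
view = base[:, ::2]

# np.array copies by default, producing a fresh owning array.
copied = np.array(view)

for name, arr in [("base", base), ("view", view), ("copied", copied)]:
    print(name, arr.flags.owndata, arr.flags.c_contiguous, arr.strides)
```

In this case the view has gaps but does not own its data, while the copy owns its data and is packed.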

Upvotes: 6

Views: 1789

Answers (2)

user10253771

Reputation: 670

Yes, they do exist.

I have run into such arrays several times, especially when dealing with arrays of OpenCV images. Here is an example:

In a project, I used the following code to read image arrays for further processing:

img = cv2.imdecode(np.fromfile(file_path, dtype=np.uint8), cv2.IMREAD_UNCHANGED)[:, :, :3]

Sometimes this leads to an error in the "further processing" part of my code, saying "Layout of the output array img is incompatible with cv::Mat". The usual solution is to add this line:

img = np.ascontiguousarray(img)

which indicates that the original array is NOT contiguous.

By saying "sometimes", I mean for most input images, my code will work smoothly without np.ascontiguousarray, but it will fail for some specific images. So I guess it is because of how those images were created.
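One possible explanation for the "sometimes", sketched below without OpenCV: the synthetic arrays stand in for decoded images, and the assumption (not confirmed by the answer) is that the failing files carry a fourth alpha channel. Note that basic slicing returns a view, so the sliced array does not own its data in either case.

```python
import numpy as np

# Hypothetical stand-ins for decoded images: height 4, width 5.
rgb = np.zeros((4, 5, 3), dtype=np.uint8)   # 3 channels
rgba = np.zeros((4, 5, 4), dtype=np.uint8)  # 4 channels (assumed alpha)

# Same slice as in the answer's snippet.
rgb_slice = rgb[:, :, :3]    # keeps every byte, still contiguous
rgba_slice = rgba[:, :, :3]  # skips one byte per pixel, not contiguous

# np.ascontiguousarray copies only when it has to.
fixed = np.ascontiguousarray(rgba_slice)

for name, arr in [("rgb_slice", rgb_slice),
                  ("rgba_slice", rgba_slice),
                  ("fixed", fixed)]:
    print(name, arr.flags.owndata, arr.flags.c_contiguous, arr.strides)
```

Under that assumption, the non-contiguous array is a non-owning view, and the array returned by np.ascontiguousarray is a new owning, contiguous copy.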

Upvotes: 0

Eelco Hoogendoorn

Reputation: 10769

I've heard it said (no source, sorry) that indeed all memory-owning arrays are contiguous. And that makes sense: how can you own a non-contiguous block? It implies you'd have to make an arbitrary number of fragmented deallocation calls when that hypothetical object gets collected... And I think that's not even possible; I think one can only release the ranges originally allocated. Viewed from the other side: ownership originates at the time of allocation, and we can only ever allocate contiguous blocks. (At least that's how it works at the malloc level; you could have a software-based allocation layer on top of that which implements logic to handle such fragmented ownership, but if any such thing exists it's news to me.)

I've contributed to jsonpickle to expand its numpy support, and this question came up there too. The code I wrote there would break (and quite horribly so) if someone were to feed it a non-contiguous owning array; it's been more than a year and I haven't seen any issues reported, so that's fairly strong empirical evidence, I'd say...

But if you are still worried about this leading to hard-to-track bugs (I don't think there is a limit to the shenanigans a C lib constructing a numpy array can get up to), I'd recommend simply asserting at runtime that no such frankenarrays ever get accidentally passed in to the wrong places.
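A minimal sketch of such a runtime check follows. The helper names is_packed and check_owning_packed are hypothetical, and "packed" is taken to mean that the bytes spanned by the strides equal the bytes actually stored, assuming the elements do not overlap in memory.

```python
import numpy as np

def is_packed(a):
    """Return True if the array's elements leave no gaps in memory."""
    if a.size == 0:
        return True
    # Distance from the lowest to the highest element address,
    # plus the size of one element.
    span = sum((n - 1) * abs(s) for n, s in zip(a.shape, a.strides))
    return span + a.itemsize == a.nbytes

def check_owning_packed(a):
    """Raise if an owning array has gaps; return the array otherwise."""
    if a.flags.owndata and not is_packed(a):
        raise ValueError("owning array with gaps in its buffer")
    return a

packed = np.zeros((3, 4))
strided_view = packed[:, ::2]   # non-owning, so the check lets it through
check_owning_packed(packed)
check_owning_packed(strided_view)
print(is_packed(packed), is_packed(strided_view))
```

Comparing the spanned bytes rather than testing the C or F contiguity flags means a packed array with permuted axes is not rejected.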

Upvotes: 1
