Reputation: 24661
I was wondering whether there is any situation in which a numpy array owning its data is stored non-contiguously.
From a numerical point of view, non-contiguous, row- or column-aligned buffers make sense and are ubiquitous in performance libraries such as IPP. However, it seems that numpy by default converts anything passed as an argument of array to a contiguous buffer. This is not stated explicitly in the documentation, as far as I understand it.
My question is: does numpy guarantee that any owning array created with np.array is contiguous in memory? More generally, in which situations can we come across a non-contiguous owning array?
EDIT following @Eelco's answer
By non-contiguous, I mean that there are some "empty spaces" in the memory chunk used to store the data (strides[1] > shape[0] * itemsize, if you will). I do not mean an array whose data is stored using two or more memory allocations; I would be surprised if such an owning numpy array existed. This seems to be consistent with numpy's terminology according to this answer.
By owning arrays, I mean arrays whose .flags.owndata=True. I am not interested in non-owning arrays, which can indeed behave wildly.
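To make the terms concrete, here is a minimal sketch of how the flags and strides can be inspected. The names base, view and copied are only illustrative, and this checks one simple 2-D case; it is not evidence of a general guarantee.

```python
import numpy as np

# An owning, C-contiguous base array.
base = np.zeros((3, 4), dtype=np.float64)

# Taking every other column gives a strided view that does not own its data.
view = base[:, ::2]

# np.array copies by default, so the result owns a freshly allocated buffer.
copied = np.array(view)

for name, arr in [("base", base), ("view", view), ("copied", copied)]:
    print(name, arr.strides, arr.flags.owndata, arr.flags.c_contiguous)
```

In this case the view has gaps between consecutive elements, while the copy made by np.array is packed again.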
Upvotes: 6
Views: 1789
Reputation: 670
Yes, they do exist.
I have met such arrays several times, especially when dealing with arrays of opencv images. I will give you an example:
In a project, I used the following code to read image arrays for further processing:
img = cv2.imdecode(np.fromfile(file_path, dtype=np.uint8), cv2.IMREAD_UNCHANGED)[:, :, :3]
Sometimes this leads to an error in the "further processing" part of my code, saying "Layout of the output array img is incompatible with cv::Mat". The usual solution is to add this line:
img = np.ascontiguousarray(img)
which indicates that the original array is NOT contiguous.
By saying "sometimes", I mean that for most input images my code works smoothly without np.ascontiguousarray, but it fails for some specific images. So I guess it is because of how those images were created.
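One possible explanation for the "sometimes", sketched below with numpy only: the [:, :, :3] slice is a view, and whether it is contiguous depends on how many channels the decoded image has. The arrays bgr and bgra are hypothetical stand-ins for decoded 3-channel and 4-channel images; whether the failing files really have a fourth channel is an assumption, not something verified here.

```python
import numpy as np

# Hypothetical stand-ins for decoded images, shaped (height, width, channels).
bgr = np.zeros((4, 5, 3), dtype=np.uint8)   # e.g. no alpha channel
bgra = np.zeros((4, 5, 4), dtype=np.uint8)  # e.g. with an alpha channel

# Basic slicing returns a view in both cases.
img3 = bgr[:, :, :3]   # keeps every channel: still packed
img4 = bgra[:, :, :3]  # drops the 4th channel: leaves a 1-byte gap per pixel

# ascontiguousarray copies only when the input is not already C-contiguous.
fixed = np.ascontiguousarray(img4)

for name, arr in [("img3", img3), ("img4", img4), ("fixed", fixed)]:
    print(name, arr.strides, arr.flags.owndata, arr.flags.c_contiguous)
```

Note that img4 here reports owndata as False, so in this sketch the non-contiguous array is a view rather than an owning array.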
Upvotes: 0
Reputation: 10769
I've heard it said (no source, sorry) that indeed all memory-owning arrays are contiguous. And that makes sense: how can you own a non-contiguous block? It implies you'd have to make an arbitrary number of fragmented deallocation calls when that hypothetical object gets collected... And I think that's not even possible; I think one can only release the ranges originally allocated. And viewed from the other side: ownership originates at the time of allocation, and we can only ever allocate contiguous blocks. (At least that's how it works at the malloc level; you could have a software-based allocation layer on top of that which implements logic to handle such fragmented ownership, but if any such thing exists, it's news to me.)
I've contributed to jsonpickle to expand its numpy support, and this question also came up there. The code I wrote there would break (and quite horribly so) if someone were to feed it a non-contiguous owning array; it's been more than a year and I haven't seen any issues reported, so that's fairly strong empirical evidence, I'd say...
But if you are still worried about this leading to hard-to-track bugs (I don't think there is a limit to the shenanigans a C lib constructing a numpy array can get up to), I'd recommend simply asserting at runtime that no such frankenarrays ever get accidentally passed into the wrong places.
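Such a runtime check could look roughly like the sketch below. The helper name assert_owning_contiguous is made up for illustration, and treating both C- and Fortran-ordered layouts as acceptable is an assumption; tighten it if your code only handles one of them.

```python
import numpy as np

def assert_owning_contiguous(arr):
    """Hypothetical guard: reject arrays that are views or have gaps."""
    if not isinstance(arr, np.ndarray):
        raise TypeError("expected a numpy.ndarray")
    if not arr.flags.owndata:
        raise ValueError("array does not own its data")
    # Accept either row-major (C) or column-major (Fortran) packed layouts.
    if not (arr.flags.c_contiguous or arr.flags.f_contiguous):
        raise ValueError("array is neither C- nor F-contiguous")
    return arr

# Passes: freshly allocated arrays in either memory order.
assert_owning_contiguous(np.zeros((3, 4)))
assert_owning_contiguous(np.zeros((3, 4), order="F"))

# Rejected: a strided view into another array's buffer.
try:
    assert_owning_contiguous(np.zeros((3, 4))[:, ::2])
except ValueError as exc:
    print("rejected:", exc)
```

Explicit raises are used instead of assert statements so the check still runs when Python is started with -O.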
Upvotes: 1