Reputation: 59731
One of the NumPy array flags is OWNDATA, which the documentation describes:
OWNDATA (O)
The array owns the memory it uses or borrows it from another object.
I was wondering if there is any use at all for this flag, at least as a piece of information in the public API. There are some questions mentioning this flag, like "How can I tell if NumPy creates a view or a copy?" or "Numpy reshape on view", which suggest that OWNDATA should generally not be used to determine whether an array is a copy or a view. But I have not found cases where the value of the flag is actually useful.
I was thinking about it with an example like this:
import numpy as np
a = np.tile([1], 3)
print(a)
# [1 1 1]
print(a.flags)
# C_CONTIGUOUS : True
# F_CONTIGUOUS : True
# OWNDATA : False
# WRITEABLE : True
# ALIGNED : True
# WRITEBACKIFCOPY : False
# UPDATEIFCOPY : False
np.tile returns a new contiguous array containing the tiled input. In the example, a is indeed contiguous, but OWNDATA is False. It turns out the reason is that there is a reshape at the end of np.tile, so technically the data is owned by another array that was later reshaped into the result of the function. However, I have no references to that array, and in every respect I should consider a as the owner of its data. I imagine that if np.tile were natively implemented, maybe OWNDATA would be True. However, I don't know (and shouldn't need to know) which NumPy functions are native or not, so it seems to me that OWNDATA does not give any useful information to end users of the library. I'm not familiar with NumPy memory management and there is probably a reason to have that information internally, but I'm not so sure about having it as a (potentially misleading) publicly accessible array flag.
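To make the situation concrete, here is a minimal sketch of how the hidden owner can still be reached through the array's base attribute. The helper name root_owner is hypothetical (it is not part of NumPy), and whether np.tile ends in a reshape is an implementation detail that may differ between NumPy versions:

```python
import numpy as np

def root_owner(arr):
    """Follow .base until the last ndarray in the chain is reached.

    Hypothetical helper for illustration only. It stops when the base is
    None or is not an ndarray (for example a bytes object or a memory map).
    """
    while isinstance(arr.base, np.ndarray):
        arr = arr.base
    return arr

a = np.tile([1], 3)
owner = root_owner(a)

print(a.flags.owndata)             # may be False, depending on how np.tile is implemented
print(owner is a)                  # False when a is a view of an intermediate array
print(owner.flags.owndata)         # the array at the end of the chain owns the buffer
print(np.shares_memory(a, owner))  # a and its owner refer to the same memory
```

So the intermediate array is not really gone: a keeps it alive through a.base, even though no variable in my code refers to it.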
Does anyone know about any actual, practical use of the OWNDATA flag?
EDIT: For clarification, I know that the value of OWNDATA is not related to whether the function that generates the array is native (compiled) or not. What I meant is that, while the array returned by np.tile does, functionally, own its data (since the actual owner of the data cannot be accessed by name anymore), the value of OWNDATA does not reflect that, and that, maybe, a compiled implementation of the function which didn't use intermediate ndarray objects might return an array with OWNDATA set to True. The point was that different implementation details may lead to different values of OWNDATA on otherwise functionally equivalent arrays, so it is not clear what the value of the flag OWNDATA is supposed to represent for a library user or how it may be useful.
Upvotes: 0
Views: 402
Reputation: 231530
I don't look at flags nearly as much as at __array_interface__ (esp. its data key).
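As a small sketch of what I mean (the variable names are only for illustration): the first element of the data tuple is the address of the array's first element, so equal addresses mean two arrays start at the same place in memory.

```python
import numpy as np

x = np.arange(12)
y = x.reshape(3, 4)   # view: new shape and strides, same buffer
z = y.copy()          # copy: separate buffer

# 'data' is an (address, read_only_flag) tuple
x_addr = x.__array_interface__['data'][0]
y_addr = y.__array_interface__['data'][0]
z_addr = z.__array_interface__['data'][0]

print(x_addr == y_addr)        # True: y starts at the same address as x
print(x_addr == z_addr)        # False: z has its own buffer
print(np.shares_memory(x, y))  # True
print(np.shares_memory(x, z))  # False
```

A sliced view such as x[3:] starts at a different address while still sharing the buffer, so for the general case np.shares_memory is the more reliable check.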
Whether a method/function is 'native' (compiled?) has nothing to do with OWNDATA.
In [16]: np.arange(12).flags['OWNDATA']
Out[16]: True
In [17]: np.arange(12).reshape(3,4).flags['OWNDATA']
Out[17]: False
In [18]: np.arange(12).reshape(3,4).copy().flags['OWNDATA']
Out[18]: True
reshape is fast compiled code, but it returns a view, a new array with its own shape and strides, but referencing the arange data buffer. That 1d arange array still exists even though I never assigned it to a variable.
The copy makes a new array with its own data. That copy is more expensive than the reshape, and usually not needed, unless I need to ensure full independence between arrays.
We can illustrate the consequence(s) of OWNDATA with:
In [19]: x = np.arange(12)
In [20]: y = x.reshape(3,4)
In [21]: z = y.copy()
In [22]: z[0,:] *= 10
In [23]: z
Out[23]:
array([[ 0, 10, 20, 30],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
In [24]: x # no change
Out[24]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
In [25]: y[0,:] *= 10
In [26]: y
Out[26]:
array([[ 0, 10, 20, 30],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
In [27]: x # changing y changed x
Out[27]: array([ 0, 10, 20, 30,  4,  5,  6,  7,  8,  9, 10, 11])
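One place where, as far as I know, the flag has a direct practical effect is the in-place ndarray.resize method, which refuses to change the total size of an array that does not own its data. A minimal sketch, with refcheck=False passed only so the example does not depend on how many references the session holds:

```python
import numpy as np

x = np.arange(12)
y = x.reshape(3, 4)   # view, OWNDATA is False
z = y.copy()          # copy, OWNDATA is True

try:
    # Changing the total size means reallocating the buffer,
    # which a view is not allowed to do.
    y.resize((4, 4))
    view_resized = True
except ValueError as err:
    view_resized = False
    print("resize on the view failed:", err)

# The copy owns its buffer, so it can be enlarged in place;
# the new trailing elements are filled with zeros.
z.resize((4, 4), refcheck=False)
print(z.shape)
```

This does not make OWNDATA a good copy-versus-view test, but it is a case where NumPy itself checks the flag before acting.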
Upvotes: 1