javidcf

Reputation: 59731

Is the array flag OWNDATA ever useful?

One of the NumPy array flags is OWNDATA, which the documentation describes as follows:

OWNDATA (O)
    The array owns the memory it uses or borrows it from another object.

I was wondering if there is any use at all for this flag, at least as a piece of information in the public API. There are some questions mentioning this flag, like "How can I tell if NumPy creates a view or a copy?" or "Numpy reshape on view", which suggest that OWNDATA should generally not be used to determine whether an array is a copy or a view. But I have not found cases where the value of the flag is actually useful.

I was thinking about it with an example like this:

import numpy as np
a = np.tile([1], 3)
print(a)
# [1 1 1]
print(a.flags)
#   C_CONTIGUOUS : True
#   F_CONTIGUOUS : True
#   OWNDATA : False
#   WRITEABLE : True
#   ALIGNED : True
#   WRITEBACKIFCOPY : False
#   UPDATEIFCOPY : False

np.tile returns a new contiguous array containing the tiled input. In the example, a is indeed contiguous, but OWNDATA is False. Turns out the reason is that there is a reshape at the end of np.tile, so technically the data is owned by another array that was later reshaped into the result of the function. However, I have no references to that array, and in every respect I should consider a as owner of its data. I imagine if np.tile was natively implemented, maybe OWNDATA would be True. However, I don't know (and shouldn't know) which NumPy functions are native or not, so it seems to me that OWNDATA does not give any useful information to end users of the library. I'm not familiar with NumPy memory management and there is probably a reason to have that information internally, but I'm not so sure about having it as a (potentially misleading) publicly accessible array flag.

Does anyone know about any actual, practical use of the OWNDATA flag?

EDIT: For clarification, I know that the value of OWNDATA is not related to whether the function that generates the array is native (compiled) or not. What I meant is that, while the array returned by np.tile does, functionally, own its data (since the actual owner of the data cannot be accessed anymore), the value of OWNDATA does not reflect that, and that, maybe, a compiled implementation of the function which didn't use intermediate ndarray objects might return an array with OWNDATA set to True. The point was that different implementation details may lead to different values of OWNDATA on otherwise functionally equivalent arrays, so it is not clear what the value of the flag OWNDATA is supposed to represent for a library user or how it may be useful.

Upvotes: 0

Views: 402

Answers (1)

hpaulj

Reputation: 231530

I don't look at flags nearly as much as at __array_interface__ (especially its data key).

Whether a method/function is 'native' (compiled?) has nothing to do with OWNDATA.

In [16]: np.arange(12).flags['OWNDATA']                                                        
Out[16]: True
In [17]: np.arange(12).reshape(3,4).flags['OWNDATA']                                           
Out[17]: False
In [18]: np.arange(12).reshape(3,4).copy().flags['OWNDATA']                                    
Out[18]: True

reshape is fast compiled code, but it returns a view: a new array with its own shape and strides that references the arange data buffer. That 1d arange array still exists even though I never assigned it to a variable.
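
One way to see that unnamed array is the view's base attribute. This is a minimal sketch; base and np.shares_memory are standard NumPy, but they are not part of the session above, so treat the variable names as illustrative:

```python
import numpy as np

y = np.arange(12).reshape(3, 4)

# y borrows its buffer, so OWNDATA is False ...
print(y.flags['OWNDATA'])           # False
# ... and y.base is the unnamed 1d arange array that owns it.
print(y.base.shape)                 # (12,)
print(y.base.flags['OWNDATA'])      # True
print(np.shares_memory(y, y.base))  # True

# A copy allocates its own buffer, so it has no base.
z = y.copy()
print(z.base is None)               # True
```

The same check can be applied to the np.tile result in the question: whenever OWNDATA is False, the base attribute should hold the intermediate array that the result was reshaped from.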

The copy makes a new array with its own data. That copy is more expensive than the reshape, and usually not needed, unless I need to ensure full independence between arrays.

We can illustrate the consequence(s) of OWNDATA with:

In [19]: x = np.arange(12)                                                                     
In [20]: y = x.reshape(3,4)                                                                    
In [21]: z = y.copy()                                                                          
In [22]: z[0,:] *= 10                                                                          
In [23]: z                                                                                     
Out[23]: 
array([[ 0, 10, 20, 30],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
In [24]: x    # no change
Out[24]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
In [25]: y[0,:] *= 10                                                                          
In [26]: y                                                                                     
Out[26]: 
array([[ 0, 10, 20, 30],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
In [27]: x    # changing y changed x
Out[27]: array([ 0, 10, 20, 30,  4,  5,  6,  7,  8,  9, 10, 11])
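
The __array_interface__ check mentioned at the top can be sketched along the same lines. This assumes only that the first element of the 'data' entry is the address where the array's buffer starts; the variable names (x_ptr, s, and so on) are mine:

```python
import numpy as np

x = np.arange(12)
y = x.reshape(3, 4)
z = y.copy()

# The first element of the 'data' tuple is the address of the buffer start.
x_ptr = x.__array_interface__['data'][0]
y_ptr = y.__array_interface__['data'][0]
z_ptr = z.__array_interface__['data'][0]

print(x_ptr == y_ptr)   # True: the view starts at the same address
print(x_ptr == z_ptr)   # False: the copy has a separate buffer

# A slice is also a view, but it starts partway into x's buffer.
s = x[2:]
s_ptr = s.__array_interface__['data'][0]
print(s_ptr - x_ptr == 2 * x.itemsize)  # True
```

Comparing addresses like this shows which arrays actually share a buffer, which is the information OWNDATA alone does not give.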

Upvotes: 1
