user545424

Reputation: 16189

Why is numpy.ravel returning a copy?

In the following example:

>>> import numpy as np
>>> a = np.arange(10)
>>> b = a[:,np.newaxis]
>>> c = b.ravel()
>>> np.may_share_memory(a,c)
False

Why is numpy.ravel returning a copy of my array? Shouldn't it just return a view of a's data?

Edit:

I just discovered that np.squeeze doesn't return a copy.

>>> b = a[:,np.newaxis]
>>> c = b.squeeze()
>>> np.may_share_memory(a,c)
True

Why is there a difference between squeeze and ravel in this case?

Edit:

As mgilson pointed out, indexing with newaxis marks the result as non-contiguous, which is why ravel returns a copy.

So the new question is: why does newaxis mark the array as non-contiguous?

The story gets even weirder though:

>>> a = np.arange(10)
>>> b = np.expand_dims(a,axis=1)
>>> b.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> c = b.ravel()
>>> np.may_share_memory(a,c)
True

According to the documentation for expand_dims, it should be equivalent to newaxis.

Upvotes: 11

Views: 2967

Answers (2)

Bi Rico

Reputation: 25823

It looks like it may have to do with the strides:

>>> c = np.expand_dims(a, axis=1)
>>> c.strides
(8, 8)

>>> b = a[:, None]
>>> b.strides
(8, 0)
>>> b.flags
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> b.strides = (8, 8)
>>> b.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

I'm not sure what difference the stride on dimension 1 could make here, since that axis has length 1 and its stride is never used to step between elements, but it looks like that is what makes numpy treat the array as non-contiguous.
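To probe the stride observation, here is a minimal sketch that builds two views by hand which differ only in the stride of the length-1 axis. It uses as_strided from numpy.lib.stride_tricks; the names v0 and v1 are just for illustration. The contiguity flags are deliberately not checked, because what they report for such arrays may depend on the NumPy version.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(10)
step = a.itemsize  # bytes between consecutive elements of a

# Two (10, 1) views of the same buffer. They differ only in the
# stride of the length-1 axis, which is never used to step anywhere.
v0 = as_strided(a, shape=(10, 1), strides=(step, 0))
v1 = as_strided(a, shape=(10, 1), strides=(step, step))

same_values = np.array_equal(v0, v1)
both_share = np.shares_memory(a, v0) and np.shares_memory(a, v1)

print(v0.strides, v1.strides)
print(same_values, both_share)
```

Both views address exactly the same elements, so any difference in how they are flagged comes from bookkeeping rather than from the memory layout itself.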

Upvotes: 3

mgilson

Reputation: 309929

This may not be the best answer to your question, but it looks like inserting a newaxis causes numpy to view the array as non-contiguous -- probably for broadcasting purposes:

>>> a=np.arange(10)
>>> b=a[:,None]
>>> a.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> b.flags
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

However, a reshape will not cause that:

>>> c=a.reshape(10,1) 
>>> c.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False

And those arrays do share the same memory:

>>> np.may_share_memory(c.ravel(),a)
True
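For checking whether a particular call handed back a view or a copy, something along these lines may help. It is only a sketch: is_view_of is a made-up helper name, and it relies on np.shares_memory, which does an exact overlap check rather than the bounds-only check of may_share_memory.

```python
import numpy as np

def is_view_of(base, candidate):
    """Return True if candidate overlaps the memory of base (hypothetical helper)."""
    return np.shares_memory(base, candidate)

a = np.arange(10)
c = a.reshape(10, 1)

raveled = c.ravel()      # ravel copies only when it has to
flattened = c.flatten()  # flatten always returns a copy

print(is_view_of(a, raveled))    # True
print(is_view_of(a, flattened))  # False
```

Running the same check on a[:, None].ravel() shows how a given NumPy version treats the newaxis case from the question.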

EDIT

np.expand_dims is actually implemented using reshape, which is why it works. (I suppose this is a slight error in the documentation.) Here's the source, without the docstring:

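from numpy import asarray  # only needed to run this snippet on its own

def expand_dims(a, axis):
    a = asarray(a)
    shape = a.shape
    # A negative axis counts from the end of the *expanded* shape.
    if axis < 0:
        axis = axis + len(shape) + 1
    # Insert a length-1 dimension at position axis.
    return a.reshape(shape[:axis] + (1,) + shape[axis:])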
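As a sanity check on the reshape-based approach quoted above, the sketch below re-creates it under a different name and compares it against np.expand_dims for a few axes. expand_dims_via_reshape is a hypothetical name, and this is not claimed to match NumPy's current source, only the version shown in this answer.

```python
import numpy as np

def expand_dims_via_reshape(a, axis):
    # Re-creation of the reshape-based approach for illustration only.
    a = np.asarray(a)
    shape = a.shape
    if axis < 0:
        axis = axis + len(shape) + 1
    return a.reshape(shape[:axis] + (1,) + shape[axis:])

x = np.arange(6).reshape(2, 3)

results = {}
for axis in (0, 1, 2, -1, -2):
    ours = expand_dims_via_reshape(x, axis)
    ref = np.expand_dims(x, axis)
    # Record whether the shapes agree and whether the result is a view of x.
    results[axis] = (ours.shape == ref.shape, np.shares_memory(x, ours))

print(results)
```

Because x is contiguous, each reshape here can be done without copying, which is consistent with expand_dims returning a view in the question's example.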

Upvotes: 6
