Imanol Luengo

Reputation: 15889

Numpy - Stacked memory view of two 1D arrays

I know that I can do the following:

import numpy as np
c = np.random.randn(20, 2)
a = c[:, 0]
b = c[:, 1]

Here, a and b are views of c's first and second columns, respectively. Modifying a or b changes c, and vice versa.

However, what I want to achieve is exactly the opposite. I want to create a 2D memory view where each column (or row) points to the memory of a different 1D array. Assuming I already have two 1D arrays, is it possible to create a 2D view of them where each row/column points to one of them?

I can create c from a and b in the following way:

c = np.c_[a, b]

However, this copies a's and b's memory into c. Can I somehow create c as a 'view' of [a b], so that modifying an element of c is reflected in the corresponding 1D array, a or b?
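A minimal sketch to confirm the copy, using small integer arrays chosen purely for illustration: np.shares_memory reports whether two arrays overlap in memory, and a write to c leaves a untouched.

```python
import numpy as np

a = np.arange(5)
b = np.arange(10, 15)

# np.c_ builds a new 2D array with its own data buffer.
c = np.c_[a, b]

# c does not overlap with either source array.
shares_a = np.shares_memory(c, a)
shares_b = np.shares_memory(c, b)

# Writing to c therefore does not reach a.
c[0, 0] = 100
print(shares_a, shares_b, a[0])
```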

Upvotes: 5

Views: 278

Answers (2)

Jaime

Reputation: 67427

While @hpaulj's answer is the correct one, for your particular case, and more as an exercise in understanding numpy memory layout than as anything with practical applications, here's how you can get a view of two 1-D arrays as columns of a common array:

>>> import numpy as np
>>> from numpy.lib.stride_tricks import as_strided
>>> a = np.arange(10)
>>> b = np.arange(20, 30)
>>> col_stride = (b.__array_interface__['data'][0] -
                  a.__array_interface__['data'][0])
>>> c = as_strided(a, shape=(10, 2), strides=(a.strides[0], col_stride))
>>> c
array([[ 0, 20],
       [ 1, 21],
       [ 2, 22],
       [ 3, 23],
       [ 4, 24],
       [ 5, 25],
       [ 6, 26],
       [ 7, 27],
       [ 8, 28],
       [ 9, 29]])
>>> c[4, 1] = 0
>>> c[6, 0] = 0
>>> a
array([0, 1, 2, 3, 4, 5, 0, 7, 8, 9])
>>> b
array([20, 21, 22, 23,  0, 25, 26, 27, 28, 29])

There are many things that can go wrong here. The main one is that the array b has not had its reference count increased, so if you delete it, its memory will be released while the view keeps accessing it. The trick also cannot be extended to more than two 1-D arrays, and it requires that both 1-D arrays have the same stride and dtype.

Of course, just because you can do it doesn't mean you should do it! And you should definitely not do this.
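To see the stride arithmetic without the dangling-memory risk, here is a sketch under a different assumption: both 1-D arrays are already column views into one owning buffer (called base here, a name made up for this example). In that case the computed column stride is simply the buffer's own second stride, and as_strided rebuilds the original 2D layout.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# One owning buffer; a and b are column views into it (illustrative setup).
base = np.arange(20).reshape(10, 2)
a, b = base[:, 0], base[:, 1]

def address(arr):
    """Memory address of the first element of arr."""
    return arr.__array_interface__['data'][0]

# Distance in bytes from a's first element to b's first element.
col_stride = address(b) - address(a)
c = as_strided(a, shape=(10, 2), strides=(a.strides[0], col_stride))

# The rebuilt view has the same layout and contents as the buffer.
same_layout = (col_stride == base.strides[1])
same_values = np.array_equal(c, base)

# Writes through c land in the shared buffer, and so in b.
c[4, 1] = -1
print(same_layout, same_values, b[4])
```

This only works because a and b live in the same allocation; with separately allocated arrays the same call has all the hazards described above.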

Upvotes: 3

hpaulj

Reputation: 231335

I don't think it is possible.

In your first example, the values of the a and b views are interwoven, as can be seen from this variation:

In [51]: c=np.arange(10).reshape(5,2)
In [52]: a, b = c[:,0], c[:,1]
In [53]: a
Out[53]: array([0, 2, 4, 6, 8])
In [54]: c.flatten()
Out[54]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

The data buffers of c and a start at the same memory address; b starts 4 bytes into that buffer, one element further along.

In [55]: c.__array_interface__
Out[55]: 
{'strides': None,
 'data': (172552624, False),...}

In [56]: a.__array_interface__
Out[56]: 
{'strides': (8,),
 'data': (172552624, False),...}

In [57]: b.__array_interface__
Out[57]: 
{'strides': (8,),
 'data': (172552628, False),...}

Even if the a,b split were by rows, b would start just further along in the same shared data buffer.

The .flags show that c is C-contiguous while b is not. Even so, b's values are accessed with a constant stride in that shared data buffer.
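The same observations can be checked programmatically. This is a small sketch; the helper name address is made up for the example, and the byte counts are expressed through c.itemsize because the default integer size differs between platforms.

```python
import numpy as np

c = np.arange(10).reshape(5, 2)
a, b = c[:, 0], c[:, 1]

def address(arr):
    """Memory address of the first element of arr."""
    return arr.__array_interface__['data'][0]

# a starts where c starts; b starts one element later.
offset_a = address(a) - address(c)
offset_b = address(b) - address(c)

# Each column view skips over the other column: a stride of two elements.
column_stride_ok = (a.strides == (2 * c.itemsize,))

print(offset_a, offset_b == c.itemsize, column_stride_ok)
print(c.flags['C_CONTIGUOUS'], b.flags['C_CONTIGUOUS'])
```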

When a and b are created separately, their data buffers are entirely separate. The numpy striding mechanism cannot step back and forth between these two data buffers. A 2d composite of a and b has to work with its own data buffer.
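One practical way around this, sketched here as a suggestion rather than something from the question, is to turn the problem around: allocate the 2D buffer once, copy the existing data in, and rebind a and b as views into it. After that single copy, all three names share memory.

```python
import numpy as np

# Two separately allocated 1-D arrays (illustrative data).
a = np.arange(5, dtype=float)
b = np.arange(10, 15, dtype=float)

# Allocate the composite buffer and copy the data in once.
c = np.empty((a.size, 2), dtype=a.dtype)
c[:, 0] = a
c[:, 1] = b

# Rebind the names so they are views into c's buffer.
a, b = c[:, 0], c[:, 1]

# Changes now propagate in both directions.
c[2, 1] = -1.0
a[0] = 99.0
print(b[2], c[0, 0])
```

The catch is that any other references to the original a and b still point at the old, separate buffers.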

I can imagine writing a class that ends up looking like what you want. The index_tricks module (numpy.lib.index_tricks) that defines np.c_ might give you ideas (e.g. a class with a custom __getitem__ method). But it wouldn't have the speed advantages of a regular 2D array, and it might be hard to implement all of the ndarray functionality.
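A rough sketch of such a class, assuming only c[row, col] style indexing with an integer column is needed. The name ColumnStack and its methods are hypothetical, not part of NumPy, and this is far from a full ndarray replacement.

```python
import numpy as np

class ColumnStack:
    """Hypothetical wrapper that presents 1-D arrays as columns.

    It keeps references to the original arrays and forwards indexing
    to them, so no data is copied until to_array() is called.
    """

    def __init__(self, *columns):
        if len({col.shape for col in columns}) != 1 or columns[0].ndim != 1:
            raise ValueError("all columns must be 1-D with equal length")
        self.columns = list(columns)

    @property
    def shape(self):
        return (self.columns[0].shape[0], len(self.columns))

    def __getitem__(self, key):
        row, col = key              # only c[row, col] with integer col
        return self.columns[col][row]

    def __setitem__(self, key, value):
        row, col = key
        self.columns[col][row] = value

    def to_array(self):
        """Return a regular 2D array; this is a copy, not a view."""
        return np.column_stack(self.columns)

a = np.arange(5)
b = np.arange(10, 15)
c = ColumnStack(a, b)

c[2, 1] = -1        # written into b
c[1:3, 0] = 0       # slice assignment is forwarded to a
print(a, b, c.shape)
```

Every access goes through Python-level method calls, which is where the speed disadvantage mentioned above comes from.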

Upvotes: 5
