Alex Hu
Alex Hu

Reputation: 23

Efficient way of creating a matrix in numpy by duplicating a given array along a diagonal

I have some array that I already know, say:

a = np.array([1,2,3])

I also know that I want a matrix of some size that has total length of length a + some amount n and width n+1, like so:

n = 4
length = len(a) + n
width = n + 1

I'm looking to create an array that looks like this:

array([[1,2,3,0,0,0,0],
       [0,1,2,3,0,0,0],
       [0,0,1,2,3,0,0],
       [0,0,0,1,2,3,0],
       [0,0,0,0,1,2,3]])

Unfortunately numpy.kron and in general block diagonals are not what I'm looking for, since that would cause the next row to increment by 3 instead of 1.

I have a way of doing it where I can create each row of the matrix by using a for loop and stacking the resulting arrays on top of each other as well as a method where I use scipy.sparse.diag to create the array using, again, a for loop, but I was wondering if there was a more efficient method.

Upvotes: 2

Views: 164

Answers (3)

Paul Panzer
Paul Panzer

Reputation: 53099

Here is an inbuilt (sort of) method. Not the fastest, though:

>>> import scipy.linalg as sl
>>> def f_pp(a, n=4):
...     pad = np.zeros((n,), a.dtype)
...     return sl.toeplitz(*map(np.concatenate, ((a[:1], pad), (a, pad))))
... 
>>> f_pp(np.array([1,2,3]))
array([[1, 2, 3, 0, 0, 0, 0],
       [0, 1, 2, 3, 0, 0, 0],
       [0, 0, 1, 2, 3, 0, 0],
       [0, 0, 0, 1, 2, 3, 0],
       [0, 0, 0, 0, 1, 2, 3]])

Upvotes: 0

Divakar
Divakar

Reputation: 221704

Here's one with np.lib.stride_tricks.as_strided that gives us views into a zeros-padded array and as such are very efficient, both memory-wise and performance-wise -

def sliding_windows(a, n=4):
    length = len(a) + n
    width = n + 1
    z_pad = np.zeros(n,dtype=a.dtype)
    ac = np.r_[z_pad, a, z_pad]
    s = ac.strides[0]    
    strided = np.lib.stride_tricks.as_strided
    return strided(ac[n:], shape=(width, length), strides=(-s,s),writeable=False)

If you need a writable version, simply make a copy with sliding_windows(a, n=4).copy().

Sample runs -

In [42]: a
Out[42]: array([1, 2, 3])

In [43]: sliding_windows(a, n=4)
Out[43]: 
array([[1, 2, 3, 0, 0, 0, 0],
       [0, 1, 2, 3, 0, 0, 0],
       [0, 0, 1, 2, 3, 0, 0],
       [0, 0, 0, 1, 2, 3, 0],
       [0, 0, 0, 0, 1, 2, 3]])

In [44]: sliding_windows(a, n=5)
Out[44]: 
array([[1, 2, 3, 0, 0, 0, 0, 0],
       [0, 1, 2, 3, 0, 0, 0, 0],
       [0, 0, 1, 2, 3, 0, 0, 0],
       [0, 0, 0, 1, 2, 3, 0, 0],
       [0, 0, 0, 0, 1, 2, 3, 0],
       [0, 0, 0, 0, 0, 1, 2, 3]])

One more with array-assignment, which should be good if you need a writable version -

def sliding_windows_arrassign(a, n=4):
    pad_length = len(a) + n + 1
    width = n + 1
    p = np.zeros((width,pad_length),dtype=a.dtype)
    p[:,:len(a)] = a
    return p.ravel()[:-n-1].reshape(width,-1)

Benchmarking on larger arrays

1) 100 elements and similar n :

In [101]: a = np.arange(1,101)

In [102]: %timeit sliding_windows(a, n=len(a)+1)
100000 loops, best of 3: 17.6 µs per loop

In [103]: %timeit sliding_windows_arrassign(a, n=len(a)+1)
100000 loops, best of 3: 8.63 µs per loop

# @Julien's soln
In [104]: %%timeit
     ...: n = len(a)+1
     ...: m = np.tile(np.hstack((a,np.zeros(n+1))),n+1)[:(n+len(a))*(n+1)]
     ...: m.shape = (n+1, n+len(a))
100000 loops, best of 3: 15 µs per loop

2) ~5000 elements and similar n :

In [82]: a = np.arange(1,5000)

In [83]: %timeit sliding_windows(a, n=len(a)+1)
10000 loops, best of 3: 23.2 µs per loop

In [84]: %timeit sliding_windows_arrassign(a, n=len(a)+1)
10 loops, best of 3: 28.9 ms per loop

# @Julien's soln
In [91]: %%timeit
    ...: n = len(a)+1
    ...: m = np.tile(np.hstack((a,np.zeros(n+1))),n+1)[:(n+len(a))*(n+1)]
    ...: m.shape = (n+1, n+len(a))
10 loops, best of 3: 34.3 ms per loop

np.lib.stride_tricks.as_strided would have a constant runtime irrespective of the array length owing to the memory efficiency discussed earlier.

Upvotes: 1

Julien
Julien

Reputation: 15206

Here's another more concise one, not sure how it compares for efficiency though:

a = np.array([1,2,3])

n = 4
m = np.tile(np.hstack((a,np.zeros(n+1))),n+1)[:(n+len(a))*(n+1)]
m.shape = (n+1, n+len(a))

Efficiency comparison (writable version):

import numpy as np

a = np.arange(100)
n = 100

def Julien(a, n=4):
    m = np.tile(np.hstack((a,np.zeros(n+1))),n+1)[:(n+len(a))*(n+1)]
    m.shape = (n+1, n+len(a))
    return m

def Divakar(a, n=4):
    length = len(a) + n
    width = n + 1
    z_pad = np.zeros(n,dtype=a.dtype)
    ac = np.r_[z_pad, a, z_pad]
    s = ac.strides[0]    
    strided = np.lib.stride_tricks.as_strided
    return strided(ac[n:], shape=(width, length), strides=(-s,s))

%timeit Julien(a)
%timeit Divakar(a)

18.1 µs ± 333 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
23.4 µs ± 1.05 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Upvotes: 0

Related Questions