FooBar
FooBar

Reputation: 16478

Efficient dense counterpart to scipy.sparse.diags

scipy.sparse.diags allows me to enter multiple diagonal vectors, together with their location, to build a matrix such as

from scipy.sparse import diags
vec = np.ones((5,))
vec2 = vec + 1
diags([vec, vec2], [-2, 2])

I'm looking for an efficient way to do the same but build a dense matrix, instead of DIA. np.diag only supports a single diagonal. What's an efficient way to build a dense matrix from multiple diagonal vectors?

Expected output: the same as np.array(diags([vec, vec2], [-2, 2]).todense())

Upvotes: 0

Views: 215

Answers (2)

hpaulj
hpaulj

Reputation: 231385

With Paul Panzer's (10,2) case

In [107]: O                                                                     
Out[107]: array([-2, -1,  0,  1])
In [108]: D                                                                     
Out[108]: 
[array([1, 2, 3, 4, 5, 6, 7, 8]),
 array([1, 2, 3, 4, 5, 6, 7, 8, 9]),
 array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10]),
 array([1, 2, 3, 4, 5, 6, 7, 8, 9])]

The diagonals have different lengths.

sparse.diags converts this to a sparse.dia_matrix:

In [109]: M = sparse.diags(D,O)                                                 
In [110]: M                                                                     
Out[110]: 
<10x10 sparse matrix of type '<class 'numpy.float64'>'
    with 36 stored elements (4 diagonals) in DIAgonal format>
In [111]: M.data                                                                
Out[111]: 
array([[ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  0.,  0.],
       [ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.,  0.],
       [ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.],
       [ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.]])

Here the ragged list of diagonals has been converted to a padded 2d array. This can be a convenient way of specifying the diagonals, but it isn't particularly efficient. It has to be converted to csr format for most calculations:

In [112]: timeit sparse.diags(D,O)                                              
99.8 µs ± 3.65 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [113]: timeit sparse.diags(D,O, format='csr')                                
371 µs ± 155 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Using np.diag I can construct the same array with an iteration

np.add.reduce([np.diag(v,k) for v,k in zip(D,O)])

In [117]: timeit np.add.reduce([np.diag(v,k) for v,k in zip(D,O)])              
39.3 µs ± 131 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

and with Paul's function:

In [120]: timeit diags_pp(D,O)                                                  
12.3 µs ± 24.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

The key step in np.diags is a flat assignment:

res[:n-k].flat[i::n+1] = v

This is essentially the same as Paul's outf assignments. So the functionality is basically the same, assigning each diagonal via a slice. Paul streamlines it.

Creating the M.data array (Out[111]) also requires copying the D arrays to a 2d array - but with different slices.

Upvotes: 1

Paul Panzer
Paul Panzer

Reputation: 53029

One way would be to index into the flattened output array using a step of N+1:

import numpy as np
from scipy.sparse import diags
from timeit import timeit

def diags_pp(vecs, offs, dtype=float, N=None):
    if N is None:
        N = len(vecs[0]) + abs(offs[0])
    out = np.zeros((N, N), dtype)
    outf = out.reshape(-1)
    for vec, off in zip(vecs, offs):
        if off<0:
            outf[-N*off::N+1] = vec
        else:
            outf[off:N*(N-off):N+1] = vec
    return out

def diags_sp(vecs, offs):
    return diags(vecs, offs).A

for N, k in [(10, 2), (100, 20), (1000, 200)]:
    print(N)
    O = np.arange(-k,k)
    D = [np.arange(1, N+1-abs(o)) for o in O]
    for n, f in list(globals().items()):
        if n.startswith('diags_'):
            print(n.replace('diags_', ''), timeit(lambda: f(D, O), number=10000//N)*N)
            if n != 'diags_sp':
                assert np.all(f(D, O) == diags_sp(D, O))

Sample run:

10
pp 0.06757194991223514
sp 1.9529316504485905
100
pp 0.45834919437766075
sp 4.684177896706387
1000
pp 23.397524026222527
sp 170.66762899048626

Upvotes: 2

Related Questions