Reputation: 1
I want to perform a "sparse outer product" of two 1-dimensional arrays x and y, subject to a sparse "template matrix" T. For example, I want to compute something like
x= np.array([1,2,3,4])
y= np.array([5,6,7])
T=np.array([[1,0,0],[0,0,1],[1,1,0],[0,1,0]])
np.multiply(T,np.outer(x,y))
which produces
array([[ 5, 0, 0],
[ 0, 0, 14],
[15, 18, 0],
[ 0, 24, 0]])
The challenge is to make this fast. The basic computational problem with this naive code is that it performs many unnecessary multiplications in the outer product. One should only have to perform multiplications where the template is nonzero.
I have tried using SciPy sparse methods, for example:
T_lil=lil_matrix(T)
T_csr=T_lil.tocsr()
diags(x).dot(T_csr.dot(diags(y)))
This theoretically avoids unnecessary multiplications by applying T first to y, then applying x to the result. It gains a speed advantage for large sizes but is so slow for smaller sizes that I know it can't be optimal.
I also tried things like
x_column=np.array([x]).T
(T_csr.multiply(x_column)).multiply(y)
which (after applying .toarray()) gives the same answer, but this is absurdly clunky and again can't be optimal.
I don't think it would help to convert x and y to a sparse encoding, because they are generally NOT sparse in my application.
Can anybody help? For my application, T might have 10^4 rows and 10^5 columns. I don't at all mind digging into the guts of the csr (or csc or coo or dok) encoding, but I expect somebody knows a better answer than I can think of.
Upvotes: 0
Views: 637
Reputation: 281949
Here's something simple you could do with a COO-format T
matrix. Use advanced indexing to multiply T.data
by the correct elements of x
and y
:
result = coo_matrix((T.data * x[T.row] * y[T.col], (T.row.copy(), T.col.copy())),
shape=T.shape)
The copy
calls avoid a few cases where modifying one of T
or result
might affect the other. You can remove them if you're sure you won't modify your matrices.
Also, be aware that this result
might have explicit zeros, particularly if x
or y
have any zeros.
Upvotes: 1