Setup
I'm aware that sparse matrices in scipy's .sparse module differ from numpy arrays. I'm also aware of existing questions about slicing sparse arrays, but those mostly deal with the performance of slicing.
My question is instead about how to cope with their different slicing behaviour. Let's create an example:
import numpy as np
from scipy import sparse
matrix = np.asarray([[0,0,0,1], [1,1,0,0], [1,0,1,0], [1,0,0,1], [1,0,0,1], [1,0,0,1]])
sparse_matrix = sparse.lil_matrix(matrix) # Or another format like .csr_matrix etc.
Given this setup, applying the same slice results in a different output:
matrix[:, 3]
# Output:
# array([ True, False, False, True, True, True], dtype=bool)
sparse_matrix[:, 3]
# Output:
# matrix([[ True],
# [False],
# [False],
# [ True],
# [ True],
# [ True]], dtype=bool)
Question
This is a bit of a bummer, since I need the first output in the second case as well. I know that using sparse_matrix.A etc. will give me the desired result. However, converting the sparse matrix to an array defeats the purpose of using a sparse matrix in the first place.
So is there a way to get the same slice result without converting the sparse matrix to an array?
Edit:
For clarification, since my question might be confusing in this regard: the slice on sparse_matrix should give the same output as the slice on matrix, meaning that something like sparse_matrix[:, 3] should output ([ True, False, False, True, True, True]).
Reputation: 231385
In [150]: arr = np.asarray([[0,0,0,1], [1,1,0,0], [1,0,1,0], [1,0,0,1], [1,0,0,1], [1,0,0,1]])
...: M = sparse.lil_matrix(arr) # Or another format like .csr_matrix etc.
A scalar index on an ndarray reduces the number of dimensions by one:
In [151]: arr[:,3]
Out[151]: array([1, 0, 0, 1, 1, 1])
It does not change the number of dimensions of the sparse matrix.
In [152]: M[:,3]
Out[152]:
<6x1 sparse matrix of type '<class 'numpy.int64'>'
with 4 stored elements in LInked List format>
This behavior is similar to that of the np.matrix subclass (and MATLAB). A sparse matrix is always 2d.
The dense array display of this matrix:
In [153]: M[:,3].A
Out[153]:
array([[1],
[0],
[0],
[1],
[1],
[1]], dtype=int64)
and the np.matrix display:
In [154]: M[:,3].todense()
Out[154]:
matrix([[1],
[0],
[0],
[1],
[1],
[1]], dtype=int64)
np.matrix has an A1 property which produces a 1d array (it converts to ndarray and applies ravel):
In [155]: M[:,3].todense().A1
Out[155]: array([1, 0, 0, 1, 1, 1], dtype=int64)
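The same idea can be wrapped in a small helper. This is only a sketch: column_as_1d is a hypothetical name, not part of scipy. It slices first and converts only the resulting 6x1 piece to a dense array, so the full matrix is never densified:

```python
import numpy as np
from scipy import sparse

def column_as_1d(sparse_mat, j):
    """Return column j of a 2d sparse matrix as a 1d ndarray.

    Hypothetical helper (not a scipy function): only the (n, 1)
    slice is made dense, never the whole matrix.
    """
    return sparse_mat[:, j].toarray().ravel()

arr = np.asarray([[0,0,0,1], [1,1,0,0], [1,0,1,0], [1,0,0,1], [1,0,0,1], [1,0,0,1]])
M = sparse.lil_matrix(arr)

col = column_as_1d(M, 3)
print(col.shape)                       # 1d, same shape as arr[:, 3]
print(np.array_equal(col, arr[:, 3]))  # compare against the ndarray slice
```

Whether the dense slice is cheap enough depends on the number of rows, but it is one column's worth of memory rather than the whole array.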
ravel, squeeze and scalar indexing are all ways of reducing the dimensions of an ndarray. But they don't reduce the dimensions of an np.matrix or a sparse matrix, which always stay 2d.
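A short check of that difference, assuming a 6x1 np.matrix like the one returned by M[:,3].todense() (the variable names here are just for illustration):

```python
import numpy as np

# 6x1 matrix, standing in for M[:,3].todense()
m = np.matrix([[1], [0], [0], [1], [1], [1]])

raveled = m.ravel()           # still an np.matrix, so still 2d
squeezed = m.squeeze()        # still an np.matrix, so still 2d
flat = np.asarray(m).ravel()  # convert to ndarray first, then ravel: 1d

print(raveled.ndim, squeezed.ndim, flat.ndim)
```

Converting to ndarray before calling ravel is what A1 does for you.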
Another example of a 2d sparse matrix:
In [156]: sparse.lil_matrix(arr[:,3])
Out[156]:
<1x6 sparse matrix of type '<class 'numpy.int64'>'
with 4 stored elements in LInked List format>
In [157]: _.A
Out[157]: array([[1, 0, 0, 1, 1, 1]], dtype=int64)
Note the [[...]]. sparse has added a leading size 1 dimension to the 1d ndarray.