Reputation: 29
I need to implement sparse matrix row slicing in C++ using Eigen 3.3.8, but I can't make it
as fast as scipy in Python.
For example, A is a sparse matrix of shape (1000000, 1000000), and index is a list of rows I want to select from A.
When I write:
A = A[index, :]
it takes 0.5114 s.
When I instead write:
def extractor(indices, N):
    indptr = np.arange(len(indices) + 1)
    data = np.ones(len(indices))
    shape = (len(indices), N)
    return sparse.csr_matrix((data, indices, indptr), shape=shape)

A = extractor(index, A.shape[0]) * A
it takes 3.381 s. (This idea is from Sparse matrix slicing using list of int.)
So how can I implement fast CSR matrix slicing?
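For completeness, here is a small self-contained version of what I'm doing (sizes reduced so it runs quickly), showing that the two approaches produce the same matrix:

```python
import numpy as np
from scipy import sparse

def extractor(indices, N):
    # Selection matrix: row k has a single 1 at column indices[k]
    indptr = np.arange(len(indices) + 1)
    data = np.ones(len(indices))
    return sparse.csr_matrix((data, indices, indptr),
                             shape=(len(indices), N))

A = sparse.random(1000, 1000, density=0.01, format='csr', random_state=0)
index = np.array([3, 10, 500])

direct = A[index, :]                            # fancy-index slicing
via_product = extractor(index, A.shape[0]) * A  # selection-matrix product

assert direct.shape == (3, 1000)
assert (direct != via_product).nnz == 0         # same entries everywhere
```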
Upvotes: 0
Views: 120
Reputation: 13295
As I've previously noted in a comment, the two methods you list are not equivalent. A[index]
with a list of indices changes the shape of the output. It can also permute or replicate rows. This can be implemented reasonably fast like this:
using CsrMatrixD = Eigen::SparseMatrix<double, Eigen::RowMajor>;

CsrMatrixD rowsFromCsr(const Eigen::Ref<const CsrMatrixD>& in,
                       const Eigen::Ref<const Eigen::VectorXi>& indices)
{
  using Triplet = Eigen::Triplet<double>;
  using InnerIterator = Eigen::Ref<const CsrMatrixD>::InnerIterator;
  std::vector<Triplet> triplets;
  int outrow = 0;
  for(int row: indices) {
    for(InnerIterator nonzero(in, row); nonzero; ++nonzero)
      triplets.emplace_back(outrow, nonzero.col(), nonzero.value());
    ++outrow;
  }
  CsrMatrixD rtrn(outrow, in.cols());
  rtrn.setFromTriplets(triplets.begin(), triplets.end());
  return rtrn;
}
This runs faster if the indices are sorted but should still work with any other arrangement. The InnerIterator may be new in Eigen-3.4.
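For readers comparing against the Python side of the question: the same triplet idea can be sketched in scipy terms (rows_from_csr is a hypothetical helper of mine, mirroring the Eigen InnerIterator loop above, not part of either library), and it handles repeated or unsorted indices the same way:

```python
import numpy as np
from scipy import sparse

def rows_from_csr(A, indices):
    # Collect (row, col, value) triplets for each requested input row,
    # mirroring the Eigen triplet loop above.
    rows, cols, vals = [], [], []
    for out_row, in_row in enumerate(indices):
        start, end = A.indptr[in_row], A.indptr[in_row + 1]
        cols.extend(A.indices[start:end])
        vals.extend(A.data[start:end])
        rows.extend([out_row] * (end - start))
    return sparse.csr_matrix((vals, (rows, cols)),
                             shape=(len(indices), A.shape[1]))

A = sparse.random(50, 50, density=0.1, format='csr', random_state=1)
idx = [5, 5, 2]   # repeated and unsorted indices are fine
out = rows_from_csr(A, idx)

assert out.shape == (3, 50)
assert (out != A[idx, :]).nnz == 0   # matches scipy's fancy indexing
```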
The second option, where you multiply with a diagonal matrix, preserves the original shape and order of elements. It can be done like this:
CsrMatrixD rowsFromCsr(const Eigen::Ref<const CsrMatrixD>& in,
                       const Eigen::Ref<const Eigen::VectorXi>& indices)
{
  Eigen::VectorXd diag = Eigen::VectorXd::Zero(in.rows());
  for(int i: indices)
    diag[i] = 1.;
  return diag.asDiagonal() * in;
}
This is still reasonably fast. Sadly there doesn't seem to be a version that uses a sparse vector for the diagonal. Using the prune
method is faster in my tests:
CsrMatrixD rowsFromCsr(const Eigen::Ref<const CsrMatrixD>& in,
                       const Eigen::Ref<const Eigen::VectorXi>& indices)
{
  std::vector<bool> bitmap(in.rows());
  for(int row: indices)
    bitmap[row] = true;
  CsrMatrixD rtrn = in;
  rtrn.prune([&bitmap](int row, int /*col*/, double /*value*/) noexcept -> bool {
    return bitmap[row];
  });
  rtrn.makeCompressed();
  return rtrn;
}
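For comparison with the question's Python timings, both shape-preserving variants above translate to scipy roughly as follows (my own sketch, assuming CSR input; eliminate_zeros plays the role of prune + makeCompressed):

```python
import numpy as np
from scipy import sparse

A = sparse.random(100, 100, density=0.05, format='csr', random_state=0)
index = np.array([1, 7, 42])

# Variant 1: multiply by a diagonal 0/1 mask; unselected rows become empty.
diag = np.zeros(A.shape[0])
diag[index] = 1.0
masked = (sparse.diags(diag) @ A).tocsr()

# Variant 2: drop entries whose row is not selected, like Eigen's prune().
keep = np.zeros(A.shape[0], dtype=bool)
keep[index] = True
pruned = A.copy()
row_of_entry = np.repeat(np.arange(A.shape[0]), np.diff(A.indptr))
pruned.data[~keep[row_of_entry]] = 0
pruned.eliminate_zeros()

# Both keep the original shape and leave the selected rows untouched.
assert masked.shape == A.shape and pruned.shape == A.shape
assert (masked != pruned).nnz == 0
assert (pruned[index, :] != A[index, :]).nnz == 0
```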
Upvotes: 1
Reputation: 231335
With a more modest sized matrix, my extractor code is slower, but only by 2x.
In [4]: M = sparse.random(1000,1000,.1,'csr')
In [5]: M
Out[5]:
<1000x1000 sparse matrix of type '<class 'numpy.float64'>'
with 100000 stored elements in Compressed Sparse Row format>
In [6]: idx = np.arange(100)
In [7]: M[idx,:]
Out[7]:
<100x1000 sparse matrix of type '<class 'numpy.float64'>'
with 10116 stored elements in Compressed Sparse Row format>
In [8]: timeit M[idx,:]
173 µs ± 514 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
The extractor:
In [10]: def extractor(indices,N):
...: indptr=np.arange(len(indices)+1)
...: data=np.ones(len(indices))
...: shape=(len(indices),N)
...: return sparse.csr_matrix((data,indices,indptr), shape=shape)
The nnz counts match:
In [13]: extractor(idx, M.shape[0])*M
Out[13]:
<100x1000 sparse matrix of type '<class 'numpy.float64'>'
with 10116 stored elements in Compressed Sparse Row format>
Somewhat slower, but not as much as in your example.
In [14]: timeit extractor(idx, M.shape[0])*M
302 µs ± 1.02 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
This test case is larger than the one I used 5 years ago.
Since I have other things to do, I'm not going to try to test other cases - larger M and longer idx.
You are welcome to explore the sparse code to see if they have changed the calculation, perhaps streamlining it in various ways that my reconstruction doesn't capture.
Upvotes: 1