Reputation: 1227
I am working on an algorithm in Rcpp (Eigen) that requires the equivalent of cbind for matrices. I've found that R's cbind is extremely fast, and using Eigen is extremely slow. I'd like to find a way to optimize this function, so I can keep my algo in Rcpp. So far I have found this other link, but it is for cbind on dense matrices
#include <RcppEigen.h>
// [[Rcpp::depends(RcppEigen)]]
using namespace Rcpp;
using namespace Eigen;
// [[Rcpp::export]]
Eigen::SparseMatrix<double> RcppMatrixCbind(Eigen::MappedSparseMatrix<double>& matrix1,
Eigen::MappedSparseMatrix<double>& matrix2) {
SparseMatrix<double> out(matrix1.rows(), matrix1.cols() + matrix2.cols());
std::vector<Triplet<double> > tripletList;
tripletList.reserve(matrix1.nonZeros() + matrix2.nonZeros());
for (int k = 0; k < matrix1.outerSize(); ++k)
{
for (MappedSparseMatrix<double>::InnerIterator it(matrix1, k); it; ++it)
{
tripletList.push_back(Triplet<double>(it.row(), it.col(), it.value()));
}
}
for (int k = 0; k < matrix2.outerSize(); ++k)
{
for (MappedSparseMatrix<double>::InnerIterator it(matrix2, k); it; ++it)
{
tripletList.push_back(Triplet<double>(it.row(), it.col() + matrix1.cols(), it.value()));
}
}
out.setFromTriplets(tripletList.begin(), tripletList.end());
return out;
}
/*** R
require(Matrix)
x = rsparsematrix(10000, 500, .1)
system.time(foo <- cbind(x,x))
system.time(bar <- RcppMatrixCbind(x, x))
*/
Upvotes: 1
Views: 589
Reputation: 368509
That's really more of an Eigen question: how to expand a sparse matrix?
What you did in your solution is doing everything element-by-element, and it is likely that a dedicated blockwise operation can beat it. Which is what we see with the Matrix
solution, going to efficient code, presumably from CHOLMD.
I had a quick look at the Eigen documentation. And it warns:
Regarding read-access, sparse matrices expose the same API than for dense matrices to access to sub-matrices such as blocks, columns, and rows. See Block operations for a detailed introduction. However, for performance reasons, writing to a sub-sparse-matrix is much more limited, and currently only contiguous sets of columns (resp. rows) of a column-major (resp. row-major) SparseMatrix are writable. Moreover, this information has to be known at compile-time, leaving out methods such as
block(...)
andcorner*(...)
.
But we're in luck as cbind()
in blocky enough. A really simple solution then is
// [[Rcpp::export]]
Eigen::SparseMatrix<double>
RcppMatrixCb2(Eigen::MappedSparseMatrix<double>& matrix1,
Eigen::MappedSparseMatrix<double>& matrix2) {
SparseMatrix<double> out(matrix1.rows(), matrix1.cols() + matrix2.cols());
out.leftCols(matrix1.cols()) = matrix1;
out.rightCols(matrix2.cols()) = matrix2;
return out;
}
which beats both previous attempts:
> benchmark(Matrix=cbind(x,x),
+ prevSol=RcppMatrixCbind(x,x),
+ newSol=RcppMatrixCb2(x,x),
+ order="relative")[,1:4]
test replications elapsed relative
3 newSol 100 0.585 1.000
1 Matrix 100 0.686 1.173
2 prevSol 100 4.603 7.868
>
My complete file is below. It lacks a test in both functions: we need to ensure matrices one and two have the same rows.
// cf https://stackoverflow.com/questions/45875668/rcpp-eigen-sparse-matrix-cbind
#include <RcppEigen.h>
// [[Rcpp::depends(RcppEigen)]]
using namespace Rcpp;
using namespace Eigen;
// [[Rcpp::export]]
Eigen::SparseMatrix<double> RcppMatrixCbind(Eigen::MappedSparseMatrix<double>& matrix1,
Eigen::MappedSparseMatrix<double>& matrix2) {
SparseMatrix<double> out(matrix1.rows(), matrix1.cols() + matrix2.cols());
std::vector<Triplet<double> > tripletList;
tripletList.reserve(matrix1.nonZeros() + matrix2.nonZeros());
for (int k = 0; k < matrix1.outerSize(); ++k) {
for (MappedSparseMatrix<double>::InnerIterator it(matrix1, k); it; ++it) {
tripletList.push_back(Triplet<double>(it.row(), it.col(), it.value()));
}
}
for (int k = 0; k < matrix2.outerSize(); ++k) {
for (MappedSparseMatrix<double>::InnerIterator it(matrix2, k); it; ++it) {
tripletList.push_back(Triplet<double>(it.row(), it.col() + matrix1.cols(), it.value()));
}
}
out.setFromTriplets(tripletList.begin(), tripletList.end());
return out;
}
// [[Rcpp::export]]
Eigen::SparseMatrix<double> RcppMatrixCb2(Eigen::MappedSparseMatrix<double>& matrix1,
Eigen::MappedSparseMatrix<double>& matrix2) {
SparseMatrix<double> out(matrix1.rows(), matrix1.cols() + matrix2.cols());
out.leftCols(matrix1.cols()) = matrix1;
out.rightCols(matrix2.cols()) = matrix2;
return out;
}
/*** R
require(Matrix)
set.seed(42)
x = rsparsematrix(10000, 500, .1)
library(rbenchmark)
benchmark(Matrix=cbind(x,x),
prevSol=RcppMatrixCbind(x,x),
newSol=RcppMatrixCb2(x,x),
order="relative")[,1:4]
*/
Upvotes: 6