Reputation: 338
Is there an efficient way to retain only those rows of an Armadillo sparse matrix whose values sum to at least some threshold across the columns of the matrix? For instance, I would want to retain the i-th row if the sum of its values is >= C, where C is some chosen value. Armadillo's documentation says that only contiguous submatrix views are allowed with sparse matrices, so I am guessing this is not easily achieved by subsetting. Is there an alternative to plainly looping through the elements and creating a new sparse matrix with new locations, values and colPtr settings that match the desired condition? Thanks!
Upvotes: 2
Views: 1175
Reputation: 1681
It may well be that the fastest-executing solution is the one you propose. If you want to take advantage of high-level Armadillo functionality (i.e. faster to code but perhaps slower to run), you can build a std::vector of "bad" row ids and then call shed_row(id) on each of them. Take care with the indexing when shedding rows: removing a row shifts every row below it up by one, so ids collected beforehand stay valid only if rows are shed from the bottom of the matrix upwards, as is done here.
#include <armadillo>
#include <vector>

// rowind, colptr, values, n_rows and n_cols are assumed to be defined already (CSC layout).
arma::sp_mat mat(rowind, colptr, values, n_rows, n_cols);
const double threshold_value = 0.01 * arma::accu(mat); // 1% of the sum of all elements
std::vector<arma::uword> bad_ids; // The rows that we want to shed
const arma::vec row_sums(arma::sum(mat, 1)); // Row sums (dim = 1 sums across columns)
// Iterate over rows in reverse order, so bad_ids is sorted from the bottom row upwards.
for (arma::uword row_id = mat.n_rows; row_id-- > 0; ) {
    if (row_sums(row_id) < threshold_value) {
        bad_ids.push_back(row_id);
    }
}
// Shed the bad rows from the bottom of the matrix and up.
for (const auto bad_id : bad_ids) {
    mat.shed_row(bad_id);
}
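For comparison, the manual rebuild described in the question can be sketched on raw compressed-column (CSC) arrays, without calling Armadillo at all. This is only an illustration under stated assumptions: the CscMatrix struct and the keep_rows_with_sum_at_least function are hypothetical names made up for this sketch, not Armadillo types or functions, and the input is assumed to be valid CSC data with row indices sorted within each column.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for raw CSC arrays (not an Armadillo type).
struct CscMatrix {
    std::size_t n_rows = 0;
    std::size_t n_cols = 0;
    std::vector<std::size_t> colptr;  // n_cols + 1 offsets into rowind/values
    std::vector<std::size_t> rowind;  // row index of each stored value
    std::vector<double> values;       // stored (nonzero) values
};

// Hypothetical helper: keep only the rows whose sum is >= threshold.
CscMatrix keep_rows_with_sum_at_least(const CscMatrix& in, double threshold) {
    assert(in.colptr.size() == in.n_cols + 1);
    assert(in.rowind.size() == in.values.size());

    // Pass 1: accumulate the row sums over all stored values.
    std::vector<double> row_sums(in.n_rows, 0.0);
    for (std::size_t k = 0; k < in.values.size(); ++k) {
        row_sums[in.rowind[k]] += in.values[k];
    }

    // Map each kept row to its new index; dropped rows get a sentinel value.
    const std::size_t dropped = in.n_rows;
    std::vector<std::size_t> new_index(in.n_rows, dropped);
    std::size_t kept = 0;
    for (std::size_t r = 0; r < in.n_rows; ++r) {
        if (row_sums[r] >= threshold) new_index[r] = kept++;
    }

    // Pass 2: copy surviving entries column by column, rebuilding colptr.
    CscMatrix out;
    out.n_rows = kept;
    out.n_cols = in.n_cols;
    out.colptr.reserve(in.n_cols + 1);
    out.colptr.push_back(0);
    for (std::size_t c = 0; c < in.n_cols; ++c) {
        for (std::size_t k = in.colptr[c]; k < in.colptr[c + 1]; ++k) {
            const std::size_t r = new_index[in.rowind[k]];
            if (r != dropped) {
                out.rowind.push_back(r);
                out.values.push_back(in.values[k]);
            }
        }
        out.colptr.push_back(out.values.size());
    }
    return out;
}
```

Because the old-to-new row mapping is monotonic, row indices stay sorted within each column. The three output arrays could then be copied into arma::uvec and arma::vec objects and passed to the same sp_mat(rowind, colptr, values, n_rows, n_cols) constructor used above. Whether this is actually faster than repeated shed_row calls would need to be measured on your data.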
Upvotes: 3