mskb

Reputation: 338

How to drop rows of an SpMat<unsigned int> in Armadillo based on a condition on row totals?

Is there an efficient way to retain only those rows of an Armadillo sparse matrix whose values sum to at least some threshold across the columns? For instance, I would want to retain the ith row if the sum of its values is >= C, where C is some chosen value. Armadillo's documentation says that only contiguous submatrix views are allowed with sparse matrices, so I am guessing this is not easily done by subsetting. Is there an alternative to plainly looping through the elements and creating a new sparse matrix with new locations, values and colPtr settings that match the desired condition? Thanks!

Upvotes: 2

Views: 1175

Answers (1)

Svaberg

Reputation: 1681

It may well be that the fastest executing solution is the one you propose. If you want to take advantage of high-level Armadillo functionality (i.e. faster to code but perhaps slower to run), you can build a std::vector of "bad" row ids and then call shed_row(id) on each. Take care with the indexing when shedding rows: removing a row shifts the index of every row below it. This is handled here by always shedding from the bottom of the matrix upwards, so the ids that are still to be shed remain valid.

#include <armadillo>
#include <vector>

arma::sp_mat mat(rowind, colptr, values, n_rows, n_cols);
const double threshold_value = 0.01 * arma::accu(mat); // 1% of the sum of all elements

std::vector<arma::uword> bad_ids; // The rows that we want to shed
const arma::vec row_sums(arma::sum(mat, 1)); // Row sums (dim = 1 sums across the columns)
// Iterate over rows in reverse order, so bad_ids ends up sorted from high to low.
for (arma::uword row_id = mat.n_rows; row_id-- > 0; ) {
  if (row_sums(row_id) < threshold_value) {
    bad_ids.push_back(row_id);
  }
}
// Shed the bad rows from the bottom of the matrix and up.
for (const auto &bad_id : bad_ids) {
  mat.shed_row(bad_id);
}
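
For comparison, the manual approach mentioned in the question can be sketched in plain C++, working directly on the CSC arrays (row indices, column pointers, values). This is only a minimal sketch under stated assumptions, not a tuned implementation: CscData and keep_heavy_rows are hypothetical names introduced for illustration, and the code assumes the row indices within each column are sorted, as the CSC layout normally requires.

```cpp
#include <cstddef>
#include <vector>

// Plain CSC (compressed sparse column) storage.
// CscData and keep_heavy_rows are hypothetical names used for illustration.
struct CscData {
  std::vector<std::size_t> rowind;   // row index of each stored value
  std::vector<std::size_t> colptr;   // n_cols + 1 offsets into rowind/values
  std::vector<unsigned int> values;  // stored (non-zero) values
  std::size_t n_rows = 0;
  std::size_t n_cols = 0;
};

// Keep only the rows whose sum is >= min_total. Surviving rows are
// renumbered consecutively and keep their original relative order.
CscData keep_heavy_rows(const CscData &in, unsigned long long min_total) {
  // Pass 1: accumulate the row sums over the stored values.
  std::vector<unsigned long long> row_sums(in.n_rows, 0);
  for (std::size_t k = 0; k < in.values.size(); ++k) {
    row_sums[in.rowind[k]] += in.values[k];
  }

  // Map each old row index to its new index, or to a "dropped" sentinel.
  const std::size_t dropped = in.n_rows;  // no kept row can get this index
  std::vector<std::size_t> new_index(in.n_rows, dropped);
  std::size_t kept = 0;
  for (std::size_t r = 0; r < in.n_rows; ++r) {
    if (row_sums[r] >= min_total) new_index[r] = kept++;
  }

  // Pass 2: copy the surviving entries column by column and rebuild colptr.
  CscData out;
  out.n_rows = kept;
  out.n_cols = in.n_cols;
  out.colptr.reserve(in.n_cols + 1);
  out.colptr.push_back(0);
  for (std::size_t c = 0; c < in.n_cols; ++c) {
    for (std::size_t k = in.colptr[c]; k < in.colptr[c + 1]; ++k) {
      const std::size_t r = new_index[in.rowind[k]];
      if (r != dropped) {
        out.rowind.push_back(r);
        out.values.push_back(in.values[k]);
      }
    }
    out.colptr.push_back(out.rowind.size());
  }
  return out;
}
```

Because the old-to-new row mapping is monotonic, the row indices within each column stay sorted. The resulting arrays could then be copied into Armadillo vectors and handed to the same (rowind, colptr, values, n_rows, n_cols) constructor used in the first line of the snippet above. This makes two linear passes over the stored values, whereas each shed_row call has to rebuild the matrix storage.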

Upvotes: 3
