kedarps

Reputation: 861

Armadillo SpMat<int> extremely slow compared to Mat<int>

I am trying to utilize sparse matrices in Armadillo, and am noticing a significant difference in access times with SpMat<int> compared to equivalent code using Mat<int>.

Description:

Below are two methods, which are identical in every respect except that Method_One uses regular matrices and Method_Two uses sparse matrices.

Both methods take the same arguments: the dimensions WW, DD and TT, the number of iterations NN, and pointers WS and DS to double arrays of length NN.

I am using Visual Studio 2017 to compile the code into a .mexw64 MEX file, which can be called from MATLAB.

Code:

void Method_One(int WW, int DD, int TT, int NN, double* WS, double* DS)
{
    Mat<int> WP(WW, TT, fill::zeros); // (13000 x 50) matrix
    Mat<int> DP(DD, TT, fill::zeros); // (1700  x 50) matrix
    Col<int> ZZ(NN, fill::zeros);     // 2,300,000-element column vector

    for (int n = 0; n < NN; n++)
    {
        int w_n = (int) WS[n] - 1;
        int d_n = (int) DS[n] - 1;
        int t_n = rand() % TT;

        WP(w_n, t_n)++;
        DP(d_n, t_n)++;
        ZZ(n) = t_n + 1;
    }
    return;
}

void Method_Two(int WW, int DD, int TT, int NN, double* WS, double* DS)
{
    SpMat<int> WP(WW, TT);        // (13000 x 50) matrix
    SpMat<int> DP(DD, TT);        // (1700  x 50) matrix
    Col<int> ZZ(NN, fill::zeros); // 2,300,000-element column vector

    for (int n = 0; n < NN; n++)
    {
        int w_n = (int) WS[n] - 1;
        int d_n = (int) DS[n] - 1;
        int t_n = rand() % TT;

        WP(w_n, t_n)++;
        DP(d_n, t_n)++;
        ZZ(n) = t_n + 1;
    }
    return;
}

Timing:

I am timing both methods using Armadillo's wall_clock timer object. For example:

wall_clock timer;
timer.tic();
Method_One(WW, DD, TT, NN, WS, DS);
double t = timer.toc();

Results:


Any insights into this are highly appreciated!



UPDATE:

This issue has been fixed in a newer version (8.100.1) of Armadillo.

Here are the new results:

Thanks to Conrad and Ryan.

Upvotes: 2

Views: 681

Answers (2)

Tobias Ribizel

Reputation: 5421

As hbrerkere already mentioned, the problem stems from the fact that the values of the matrix are stored in a packed format (CSC) that makes it time-consuming to

  1. Find the index of an already existing entry: Depending on whether the column entries are sorted by their row index, you need either a linear or a binary search.

  2. Insert a value that was previously zero: Here you need to find the insertion point for your new value and move all elements after that, leading to Ω(n) worst case time for a single insertion!

All these operations are constant-time operations for dense matrices, which mostly explains the runtime difference.
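
For intuition, here is a minimal sketch (not Armadillo's actual implementation; the names col_ptrs, row_indices and values are simply the three conventional CSC arrays) of what a single element lookup in a CSC structure involves:

#include <algorithm>
#include <cstddef>
#include <vector>

// col_ptrs[j] .. col_ptrs[j + 1] delimits the stored entries of column j.
// With sorted row indices the lookup is a binary search; inserting a new
// entry would additionally require shifting everything after that point.
double csc_get(const std::vector<std::size_t>& col_ptrs,
               const std::vector<std::size_t>& row_indices,
               const std::vector<double>& values,
               std::size_t i, std::size_t j)
{
    auto first = row_indices.begin() + col_ptrs[j];
    auto last  = row_indices.begin() + col_ptrs[j + 1];
    auto it    = std::lower_bound(first, last, i); // binary search on row index
    if (it != last && *it == i)
        return values[it - row_indices.begin()];
    return 0.0; // structural zero: not stored at all
}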

My usual solution has been to use a separate sparse matrix type for assembly (where you typically access an element multiple times), based on the coordinate format (storing (i, j, value) triples), with a map such as std::map or std::unordered_map to look up the index of the triple corresponding to a position (i, j) in the matrix.

Some similar approaches are also discussed in this question about matrix assembly.

Example from my most recent use:

#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

class DynamicSparseMatrix {
    using Number = double;
    using Index = std::size_t;
    using Entry = std::pair<Index, Index>;
    std::vector<Number> values;
    std::vector<Index> rows;
    std::vector<Index> cols;
    std::map<Entry, Index> map; // unordered_map might be faster,
                                // but you need a suitable hash function
                                // like boost::hash<Entry> for this.
    Index num_rows;
    Index num_cols;

    ...

    Number& value(Index row, Index col) {
        // just to prevent misuse
        assert(row >= 0 && row < num_rows);
        assert(col >= 0 && col < num_cols);
        // Find the entry in the matrix
        Entry e{row, col};
        auto it = map.find(e);
        // If the entry hasn't previously been stored
        if (it == map.end()) {
            // Add a new entry by adding its value and coordinates
            // to the end of the storage vectors.
            it = map.insert(std::make_pair(e, values.size())).first;
            rows.push_back(row);
            cols.push_back(col);
            values.push_back(0);
        }
        // Return the value
        return values[it->second];
    }

    ...

};

After assembly, you can take rows, cols and values (which together represent the matrix in coordinate (COO) format), possibly sort them, and do a batch insertion into your Armadillo matrix.
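
As a rough sketch of that last step (the function name to_armadillo and its free-standing signature are mine; since the map above guarantees that each (i, j) position appears only once, the plain batch constructor is sufficient):

#include <armadillo>
#include <cstddef>
#include <vector>

// Convert the COO triplets collected during assembly into an Armadillo
// sparse matrix in a single batch operation.
arma::sp_mat to_armadillo(const std::vector<std::size_t>& rows,
                          const std::vector<std::size_t>& cols,
                          const std::vector<double>& vals,
                          std::size_t n_rows, std::size_t n_cols)
{
    arma::umat locations(2, rows.size());
    for (std::size_t k = 0; k < rows.size(); ++k)
    {
        locations(0, k) = rows[k]; // row index of the k-th entry
        locations(1, k) = cols[k]; // column index of the k-th entry
    }
    arma::vec values(vals);

    // The batch constructor sorts the locations and builds the CSC
    // representation once, instead of re-packing it on every insertion.
    return arma::sp_mat(locations, values, n_rows, n_cols);
}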

Upvotes: 2

hbrerkere

Reputation: 1615

Sparse matrices are stored in a compressed format (CSC). Every time a non-zero element is inserted into a sparse matrix, the entire internal representation has to be updated, which is time-consuming.

It's much faster to construct the sparse matrix using batch constructors.
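
Applied to the code in the question, that could look roughly like this (a sketch, assuming the add_values form of the batch-insertion constructor, which sums the values of repeated locations; Method_Two_Batch is just an illustrative name):

void Method_Two_Batch(int WW, int DD, int TT, int NN, double* WS, double* DS)
{
    // First collect every increment as a (row, column) location ...
    umat WP_locs(2, NN);
    umat DP_locs(2, NN);
    Col<int> ones_vec(NN, fill::ones);
    Col<int> ZZ(NN, fill::zeros);

    for (int n = 0; n < NN; n++)
    {
        int w_n = (int) WS[n] - 1;
        int d_n = (int) DS[n] - 1;
        int t_n = rand() % TT;

        WP_locs(0, n) = w_n;  WP_locs(1, n) = t_n;
        DP_locs(0, n) = d_n;  DP_locs(1, n) = t_n;
        ZZ(n) = t_n + 1;
    }

    // ... then build each sparse matrix in one pass; with add_values set
    // to true, values at repeated locations are accumulated, reproducing
    // the WP(w_n, t_n)++ counting of the original loop.
    SpMat<int> WP(true, WP_locs, ones_vec, WW, TT);
    SpMat<int> DP(true, DP_locs, ones_vec, DD, TT);
}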

Upvotes: 1
