Armadillo SpMat extremely slow compared to Mat

Question

I am trying to utilize sparse matrices in Armadillo, and am noticing a significant difference in access times with SpMat compared to equivalent code using Mat.

Description:

Below are two methods, which are identical in every respect except that Method_One uses regular matrices and Method_Two uses sparse matrices.

Both methods take following arguments:

WS, DS: Pointers to a NN dimensional array
WW: 13 K [max(WS)]
DD: 1.7 K [max(DS)]
NN: 2.3 M
TT: 50

I am using Visual Studio 2017 for compiling the code into a .mexw64 executable which can be called from Matlab.

Code:

void Method_One(int WW, int DD, int TT, int NN, double* WS, double* DS)
{
    Mat WP(WW, TT, fill::zeros); // (13000 x 50) matrix
    Mat DP(DD, TT, fill::zeros); // (1700  x 50) matrix
    Col ZZ(NN, fill::zeros);     // 2,300,000 column vector

    for (int n = 0; n < NN; n++)
    {
        int w_n = (int) WS[n] - 1;
        int d_n = (int) DS[n] - 1;
        int t_n = rand() % TT;

        WP(w_n, t_n)++;
        DP(d_n, t_n)++;
        ZZ(n) = t_n + 1;
    }
    return;
}

void Method_Two(int WW, int DD, int TT, int NN, double* WS, double* DS)
{
    SpMat WP(WW, TT);        // (13000 x 50) matrix
    SpMat DP(DD, TT);        // (1700  x 50) matrix
    Col ZZ(NN, fill::zeros); // 2,300,000 column vector

    for (int n = 0; n < NN; n++)
    {
        int w_n = (int) WS[n] - 1;
        int d_n = (int) DS[n] - 1;
        int t_n = rand() % TT;

        WP(w_n, t_n)++;
        DP(d_n, t_n)++;
        ZZ(n) = t_n + 1;
    }
    return;
}

Timing:

I am timing both methods using wall_clock timer object in Armadillo. For example,

wall_clock timer;
timer.tic();
Method_One(WW, DD, TT, NN, WS, DS);
double t = timer.toc();

Results:

Timing elapsed for Method_One using Mat: 0.091 sec
Timing elapsed for Method_Two using SpMat: 30.227 sec (almost 300 times slower)

Any insights into this are highly appreciated!

UPDATE:

This issue has been fixed with newer version (8.100.1) of Armadillo.

Here are the new results:

Timing elapsed for Method_One using Mat: 0.141 sec
Timing elapsed for Method_Two using SpMat: 2.127 sec (15 times slower, which is acceptable!)

Thanks to Conrad and Ryan.

Tobias Ribizel · Accepted Answer

As hbrerkere already mentioned, the problem stems from the fact that the values of the matrix are stored in a packed format (CSC) that makes it time-consuming to

Find the index of an already existing entry: Depending on whether the column entries are sorted by their row index you need either linear or binary search.
Insert a value that was previously zero: Here you need to find the insertion point for your new value and move all elements after that, leading to Ω(n) worst case time for a single insertion!

All these operations are constant-time operations for dense matrices, which mostly explains the runtime difference.

My usual solution was to use a separate sparse matrix type for assembly (where you usually access an element multiple times) based on the coordinate format (storing triples (i, j, value)) that uses a map like std::map or std::unordered_map to store the triple index corresponding to a position (i,j) in the matrix.

Some similar approaches are also discussed in this question about matrix assembly

Example from my most recent use:

class DynamicSparseMatrix {
    using Number = double;
    using Index = std::size_t;
    using Entry = std::pair;
    std::vector values;
    std::vector rows;
    std::vector cols;
    std::map map; // unordered_map might be faster,
                                // but you need a suitable hash function
                                // like boost::hash for this.
    Index num_rows;
    Index num_cols;

    ...

    Number& value(Index row, Index col) {
        // just to prevent misuse
        assert(row >= 0 && row < num_rows);
        assert(col >= 0 && col < num_cols);
        // Find the entry in the matrix
        Entry e{row, col};
        auto it = map.find(e);
        // If the entry hasn't previously been stored
        if (it == map.end()) {
            // Add a new entry by adding its value and coordinates
            // to the end of the storage vectors.
            it = map.insert(make_pair(e, values.size())).first;
            rows.push_back(row);
            cols.push_back(col);
            values.push_back(0);
        }
        // Return the value
        return values[(*it).second];
    }

    ...

};

After assembly you can store all the values from rows, cols, values (which actually represent the matrix in Coordinate format), possibly sort them and do a batch insertion into your Armadillo matrix.

Armadillo SpMat<int> extremely slow compared to Mat<int>

Description:

Code:

Timing:

Results:

UPDATE:

Answers (2)

Related Questions

Armadillo SpMat&lt;int&gt; extremely slow compared to Mat&lt;int&gt;

Description:

Code:

Timing:

Results:

UPDATE:

Answers (2)

Related Questions

Armadillo SpMat<int> extremely slow compared to Mat<int>