Reputation: 861
I am trying to utilize sparse matrices in Armadillo, and am noticing a significant difference in access times with SpMat<int>
compared to equivalent code using Mat<int>
.
Below are two methods, which are identical in every respect except that Method_One
uses regular matrices and Method_Two
uses sparse matrices.
Both methods take following arguments:
WS, DS
: Pointers to a NN
dimensional arrayWW
: 13 K [max(WS)
]DD
: 1.7 K [max(DS)
]NN
: 2.3 MTT
: 50I am using Visual Studio 2017 for compiling the code into a .mexw64
executable which can be called from Matlab
.
void Method_One(int WW, int DD, int TT, int NN, double* WS, double* DS)
{
Mat<int> WP(WW, TT, fill::zeros); // (13000 x 50) matrix
Mat<int> DP(DD, TT, fill::zeros); // (1700 x 50) matrix
Col<int> ZZ(NN, fill::zeros); // 2,300,000 column vector
for (int n = 0; n < NN; n++)
{
int w_n = (int) WS[n] - 1;
int d_n = (int) DS[n] - 1;
int t_n = rand() % TT;
WP(w_n, t_n)++;
DP(d_n, t_n)++;
ZZ(n) = t_n + 1;
}
return;
}
void Method_Two(int WW, int DD, int TT, int NN, double* WS, double* DS)
{
SpMat<int> WP(WW, TT); // (13000 x 50) matrix
SpMat<int> DP(DD, TT); // (1700 x 50) matrix
Col<int> ZZ(NN, fill::zeros); // 2,300,000 column vector
for (int n = 0; n < NN; n++)
{
int w_n = (int) WS[n] - 1;
int d_n = (int) DS[n] - 1;
int t_n = rand() % TT;
WP(w_n, t_n)++;
DP(d_n, t_n)++;
ZZ(n) = t_n + 1;
}
return;
}
I am timing both methods using wall_clock
timer object in Armadillo. For example,
wall_clock timer;
timer.tic();
Method_One(WW, DD, TT, NN, WS, DS);
double t = timer.toc();
Method_One
using Mat<int>
: 0.091 sec
Method_Two
using SpMat<int>
: 30.227 sec
(almost 300 times slower)Any insights into this are highly appreciated!
This issue has been fixed with newer version (8.100.1) of Armadillo.
Here are the new results:
Method_One
using Mat<int>
: 0.141 sec
Method_Two
using SpMat<int>
: 2.127 sec
(15 times slower, which is acceptable!)Thanks to Conrad and Ryan.
Upvotes: 2
Views: 681
Reputation: 5421
As hbrerkere already mentioned, the problem stems from the fact that the values of the matrix are stored in a packed format (CSC) that makes it time-consuming to
Find the index of an already existing entry: Depending on whether the column entries are sorted by their row index you need either linear or binary search.
Insert a value that was previously zero: Here you need to find the insertion point for your new value and move all elements after that, leading to Ω(n) worst case time for a single insertion!
All these operations are constant-time operations for dense matrices, which mostly explains the runtime difference.
My usual solution was to use a separate sparse matrix type for assembly (where you usually access an element multiple times) based on the coordinate format (storing triples (i, j, value)) that uses a map like std::map
or std::unordered_map
to store the triple index corresponding to a position (i,j)
in the matrix.
Some similar approaches are also discussed in this question about matrix assembly
Example from my most recent use:
class DynamicSparseMatrix {
using Number = double;
using Index = std::size_t;
using Entry = std::pair<Index, Index>;
std::vector<Number> values;
std::vector<Index> rows;
std::vector<Index> cols;
std::map<Entry, Index> map; // unordered_map might be faster,
// but you need a suitable hash function
// like boost::hash<Entry> for this.
Index num_rows;
Index num_cols;
...
Number& value(Index row, Index col) {
// just to prevent misuse
assert(row >= 0 && row < num_rows);
assert(col >= 0 && col < num_cols);
// Find the entry in the matrix
Entry e{row, col};
auto it = map.find(e);
// If the entry hasn't previously been stored
if (it == map.end()) {
// Add a new entry by adding its value and coordinates
// to the end of the storage vectors.
it = map.insert(make_pair(e, values.size())).first;
rows.push_back(row);
cols.push_back(col);
values.push_back(0);
}
// Return the value
return values[(*it).second];
}
...
};
After assembly you can store all the values from rows
, cols
, values
(which actually represent the matrix in Coordinate format), possibly sort them and do a batch insertion into your Armadillo matrix.
Upvotes: 2
Reputation: 1615
Sparse matrices are stored in a compressed format (CSC). Every time a non-zero element inserted into a sparse matrix, the entire internal representation has to be updated. This is time consuming.
It's much faster to construct the sparse matrix using batch constructors.
Upvotes: 1