SixSigma

Reputation: 3260

What's the most efficient way to store a subset of column indices of a big matrix in C++?

I am working with a very big matrix X (say, 1,000-by-1,000,000). My algorithm works as follows:

  1. Scan the columns of X one by one and, based on some filtering rules, identify the subset of columns that are needed. Denote the set of selected column indices by S. Its size depends on the filter, so it is unknown before the computation and will change when the filtering rules change.
  2. Loop over S and do some computation with column x_i for each i in S. This step needs to be parallelized with OpenMP.
  3. Repeat steps 1 and 2 100 times, with the filtering rules changed each time via a parameter.

I am wondering what the best way is to implement this procedure in C++. Here are two ways I can think of:

(a) Use a 0-1 indicator array (of length 1,000,000) to mark the needed columns in step 1; then in step 2 loop over all 1,000,000 columns and, with an if check on the indicator, do the computation only for columns whose indicator is 1;

(b) Use a std::vector for S and push_back each column index identified as needed; then loop only over S, each time extracting a column index from S and doing the computation. (I considered this approach, but I have read that push_back is expensive even when just storing integers.)
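For concreteness, here is roughly what I have in mind for the two options (just a sketch; filter() and compute() below are placeholders for my real filtering rule and per-column computation):

    #include <cstddef>
    #include <vector>

    constexpr std::size_t kCols = 1000000;

    bool filter(std::size_t j) { return j % 2 == 0; }  // placeholder rule
    void compute(std::size_t j) { (void)j; }           // placeholder computation

    void option_a() {
        std::vector<char> keep(kCols, 0);              // 0-1 indicator array
        for (std::size_t j = 0; j < kCols; ++j)
            keep[j] = filter(j) ? 1 : 0;
        #pragma omp parallel for
        for (std::size_t j = 0; j < kCols; ++j)        // scan everything, skip unwanted
            if (keep[j]) compute(j);
    }

    void option_b() {
        std::vector<std::size_t> S;                    // indices of needed columns only
        for (std::size_t j = 0; j < kCols; ++j)
            if (filter(j)) S.push_back(j);
        #pragma omp parallel for
        for (std::size_t k = 0; k < S.size(); ++k)     // loop over the subset directly
            compute(S[k]);
    }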

Since my algorithm is very time-consuming, I assume even a small saving in this basic step would add up to a lot overall. So my question is: should I use (a), (b), or some even better approach for performance (and for working with OpenMP)?

Any suggestions or comments for achieving a better speedup are greatly appreciated. Thank you very much!

Upvotes: 0

Views: 91

Answers (3)

MikeMB

Reputation: 21136

(Based on my gut feeling) I'd probably go for pushing back into a vector, but the answer is quite simple: Measure both methods (they are both trivial to implement). Most likely you won't see a noticeable difference.
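For example, a throwaway harness along these lines (run_a and run_b are stand-ins for your two variants) would settle it:

    #include <chrono>
    #include <iostream>

    void run_a() { /* variant (a): indicator array + full scan */ }
    void run_b() { /* variant (b): vector of kept indices */ }

    template <class F>
    double time_ms(F f) {
        auto t0 = std::chrono::steady_clock::now();
        f();
        auto t1 = std::chrono::steady_clock::now();
        return std::chrono::duration<double, std::milli>(t1 - t0).count();
    }

    int main() {
        std::cout << "indicator array: " << time_ms(run_a) << " ms\n";
        std::cout << "index vector:    " << time_ms(run_b) << " ms\n";
    }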

Upvotes: 0

James K. Lowden

Reputation: 7837

I think you will find std::vector easier to use. Regarding push_back, the cost comes when the vector reallocates (and copies) its data. To avoid that (if it matters), you can reserve capacity for 1,000,000 elements up front with vector::reserve. Your vector is then 8 MB, insignificant compared to your problem size. That is only about one order of magnitude bigger than a bitmap would be, and a lot simpler to deal with: if we call your vector S, then the index of the nth interesting column is S[n], and your column access is just X[S[n]].
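A sketch of what I mean (keep_column here is a stand-in for your filtering rule):

    #include <cstddef>
    #include <vector>

    bool keep_column(std::size_t j) { return j % 10 == 0; }  // stand-in filter

    std::vector<std::size_t> select_columns(std::size_t ncols) {
        std::vector<std::size_t> S;
        S.reserve(ncols);        // one allocation up front: push_back never reallocates
        for (std::size_t j = 0; j < ncols; ++j)
            if (keep_column(j))
                S.push_back(j);
        return S;
    }
    // The nth interesting column of X is then simply X[S[n]].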

Upvotes: 1

Mike Robinson

Reputation: 8945

To me, it seems that step #1 really doesn't matter much: at the end of the day, you're going to wind up with a set of columns, however it's represented.

What's really going to matter is what happens when you unleash the parallelized step #2.

An array of ones and zeros, however large, should be simple to parallelize over, while a more "advanced" data structure might well, in this case, just get in the way.

"One thousand mega-bits, these days?" Sure. Done. No problem. ("And if not, a simple array of bit-sets.") However-many simultaneously executing entities should be able to navigate such a data structure, in parallel, with a minimum of conflict . . . Therefore, to my gut, "big bit-sets win."

Upvotes: 1
