Alejandro Andrade

Reputation: 2216

Matrix multiplication in C++ Armadillo is very slow

I'm doing some basic multiplication using Armadillo, but for some reason it takes a very long time to complete. I'm quite new to C++, so I might be doing something wrong, but I can't see it even in this very basic example:

#include <armadillo>
#include <iostream>

using namespace arma;

int main(){
    arma::vec coefficients = {1.0, 1.09, 1.08};
    arma::mat X = arma::mat(100000, 3, fill::randu) * coefficients;

    cout << X.n_cols;
}

By very slow I mean that I have run this example for several minutes and it doesn't finish.

EDIT

I ran the program with perf stat ./main, but stopped it after a while because it shouldn't take that long. This is the output:

^C./main: Interrupt

 Performance counter stats for './main':

        257,169.20 msec task-clock                #    1.003 CPUs utilized          
             3,342      context-switches          #   12.995 /sec                   
               215      cpu-migrations            #    0.836 /sec                   
             1,312      page-faults               #    5.102 /sec                   
   963,025,520,077      cycles                    #    3.745 GHz                    
   542,959,361,927      instructions              #    0.56  insn per cycle         
   113,002,342,332      branches                  #  439.409 M/sec                  
     1,095,168,312      branch-misses             #    0.97% of all branches        

     256.349026907 seconds time elapsed

     147.860947000 seconds user
     109.317743000 seconds sys

Upvotes: 2

Views: 841

Answers (1)

darcamo

Reputation: 3493

Armadillo is a template-based library that can be used as a header-only library: just include its header and make sure you link with some BLAS and LAPACK implementation. When used like this, Armadillo assumes you have a BLAS and LAPACK implementation available, and you will get link errors if you use any Armadillo functionality that requires them without linking against them. If you don't have BLAS and/or LAPACK, you can edit the armadillo_bits/config.hpp file and comment out the relevant defines so that Armadillo uses its own (slower) implementation of that functionality.

Alternatively, Armadillo can be compiled as a wrapper library, in which case you just link with the "armadillo" wrapper library. Its CMake code tries to determine at configure time what you have available and comments out the appropriate defines if some requirement is missing, which in turn makes it use the slower implementation. In your case, that configure code is wrongly determining that you don't have BLAS available, and BLAS is the one providing fast matrix multiplication.

My suggestion is to make sure you have BLAS and LAPACK installed and to use Armadillo as a header-only library, linking your program with BLAS and LAPACK.

Another option is to use the Conan package manager to install Armadillo. Conan recently added a recipe for Armadillo. It has the advantage that it installs everything Armadillo needs (it installs OpenBLAS, which provides both a BLAS and a LAPACK implementation) and it is system-agnostic (similar to virtual environments in Python).


Note

In the comments you mentioned that it worked with g++ main.cpp -o main -DARMA_DONT_USE_WRAPPER -larmadillo -llapack. The reason is that even if you installed the wrapper library, defining ARMA_DONT_USE_WRAPPER means you are effectively using Armadillo as a header-only library. You can replace -larmadillo -llapack with -lblas -llapack.

Upvotes: 2
