stefano1

C++ Matrix product: increased speed with little changes

I'm writing code to multiply a sparse matrix by a full matrix.

I've created two classes, SparseMatrix and Matrix, which store their data as a vector of shared pointers to vectors. In the SparseMatrix case I store each item as an object, called SparseMatrixItem, with two attributes: position and value. In the Matrix case I simply store the value. Both can be row-based or column-based, controlled by a bool attribute.
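The layout described above could be sketched roughly as follows. This is only a hypothetical reconstruction (the names `SparseMatrixSketch` and `lines` are my own; the real classes surely have more members):

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical sketch of the described layout; real class names
// and members may differ.
template <typename scalar>
struct SparseMatrixItem {
    int position;  // 1-based index within the row (or column)
    scalar value;
};

template <typename scalar>
struct SparseMatrixSketch {
    bool rowBased;  // true: rows stored contiguously; false: columns
    // One shared_ptr per row (or column), each owning its stored items.
    std::vector<std::shared_ptr<std::vector<SparseMatrixItem<scalar>>>> lines;
};
```

Only nonzero entries are stored per line, which is why each item carries its position explicitly.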

Now I'm trying to write an efficient version of the standard product of the two matrices. For simplicity, the first implementation handles only the case where the first matrix is a row-based SparseMatrix and the second is a column-based Matrix. The code lives in the SparseMatrix class as an overload of operator *.

Here is my implementation:

template <typename scalar>
Matrix<scalar> SparseVectorMatrix<scalar>::operator*(Matrix<scalar> &input2) {
    Matrix<scalar> newMatrix(getNumberOfRows(), input2.getNumberOfColumns(), true);
    int numberOfRows = newMatrix.getNumberOfRows();
    int numberOfColumns = newMatrix.getNumberOfColumns();

    for (int i = 0; i < numberOfRows; i++) {
        vector<SparseMatrixItem<scalar>>& readRow(*horizontalVectorMatrix[i]);
        vector<scalar>& writeRow(*newMatrix.internalMatrix[i]);

        for (int j = 0; j < numberOfColumns; j++) {
            vector<scalar>& readColumn(*input2.internalMatrix[j]);
            writeRow[j] = fastScalarProduct(readRow, readColumn);
        }
    }
    return newMatrix;
}

The strange thing I cannot figure out is that if I swap the order of the two loops, performance improves dramatically. I tested it with two matrices, 6040x4000 and 4000x6040: the first implementation took nearly 30 seconds, while the second took only 12. Here is the second version:

template <typename scalar>
Matrix<scalar> SparseVectorMatrix<scalar>::operator*(Matrix<scalar> &input2) {
    Matrix<scalar> newMatrix(getNumberOfRows(), input2.getNumberOfColumns(), true);
    int numberOfRows = newMatrix.getNumberOfRows();
    int numberOfColumns = newMatrix.getNumberOfColumns();

    for (int j = 0; j < numberOfColumns; j++) {
        vector<scalar>& readColumn(*input2.internalMatrix[j]);
        vector<scalar>& writeColumn(*newMatrix.internalMatrix[j]);

        for (int i = 0; i < numberOfRows; i++) {
            vector<SparseMatrixItem<scalar>>& readRow(*horizontalVectorMatrix[i]);
            writeColumn[i] = fastScalarProduct(readRow, readColumn);
        }
    }
    return newMatrix;
}

Here is also the code of the fastScalarProduct() function that I use:

template <typename scalar>
scalar SparseVectorMatrix<scalar>::fastScalarProduct
    ( vector<SparseMatrixItem<scalar>> &vector1
    , const vector<scalar> &vector2
    ) {
    scalar totalSum = 0;
    auto sizeVector1 = vector1.size();

    for (decltype(sizeVector1) i = 0; i < sizeVector1; i++) {
        int position = vector1[i].position - 1;  // positions are stored 1-based
        if (vector2[position]) {
            totalSum += vector1[i].value * vector2[position];
        }
    }
    return totalSum;
}

I tried the same product in MATLAB and it takes only about 1.5 seconds. I suspect there are issues with cache memory, but since I'm new to this kind of problem I cannot figure out the real cause.

I'm also trying to write an efficient full-matrix product, and I'm facing the same problem there.

Upvotes: 1

Views: 120

Answers (1)

Laurent Huot

You are right in saying the "issue" is with cache memory. I suggest you read about locality of reference (http://en.wikipedia.org/wiki/Locality_of_reference), which explains why your program runs faster when the loop with the most iterations is inside the one with fewer iterations. Basically, arrays are linear data structures, and they make great use of spatial locality.
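The effect is easy to reproduce with a toy example unrelated to your classes. Both functions below compute the same sum over a row-major matrix stored in one flat vector; the first walks memory contiguously (good spatial locality), while the second jumps by a stride of `cols` on every inner iteration, which causes far more cache misses on large matrices:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major matrix, summed row by row: the inner loop touches
// consecutive addresses, so each cache line is fully used.
double sumRowOrder(const std::vector<double>& a,
                   std::size_t rows, std::size_t cols) {
    double s = 0.0;
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)  // contiguous walk
            s += a[i * cols + j];
    return s;
}

// Same matrix, summed column by column: the inner loop strides by
// `cols` doubles each step, so most of each fetched cache line is wasted.
double sumColumnOrder(const std::vector<double>& a,
                      std::size_t rows, std::size_t cols) {
    double s = 0.0;
    for (std::size_t j = 0; j < cols; ++j)
        for (std::size_t i = 0; i < rows; ++i)  // stride of `cols`
            s += a[i * cols + j];
    return s;
}
```

Timing these two functions on a matrix of a few thousand rows and columns should show a gap of the same flavor as your 30 s vs 12 s measurement, even though they perform the identical arithmetic.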

As for the time it took to run the algorithm in matlab vs C++, I suggest you read this post: Why is MATLAB so fast in matrix multiplication?

Upvotes: 1
