How to do a matrix multiplication using and a 1-D array in c++?

Question

I am trying to do a matrix multiplication using threads. But I do not get correct values. Since the matrix can be very large I use the heap memory. The matrix is thus stored in a 1-D array.

The matrix is always a square matrix thus the number of rows and the number of columns is equal to the square root of the array length. If the array length is 16 then the numbers of rows is 4 and the number of columns is also 4.

I can not use an std::vector so that's why the use of std::unique_ptr.

There are 4 threads and each of them receives 1/4 of the original array to work with. This doesn't work due to the nature of matrix multiplication and I can't seem to find the right solution. How can I split the task for 4 threads?

auto matrixmultiplication(float* &array1, float* &array2, int arrayLength) {
    unique_ptr arrayOut(new float[arrayLength]);
    auto numberOfThreads = 4;
    auto widthMatrix = (int)sqrt(arrayLength);
    auto elementsPerThread = (int)sqrt(arrayLength / numberOfThreads);

    auto mul = [](auto* array1, auto* array2, auto* array3, auto dimension) {
        for (auto x = 0; x < dimension; x++) {
            for (auto y = 0; y < dimension; y++) {
                array3[dimension * x + y] = 0;
                for (auto z = 0; z < dimension; z++) {
                    array3[dimension * x + y] += array1[dimension * x + z] * array2[dimension * z + y];
                }
            }
        }
    };

    vector threads;
    for (auto i = 0; i < numberOfThreads; i++) {
        threads.push_back(
            thread(
                mul,
                array1 + i * elementsPerThread,
                array2,
                arrayOut.get() + i * elementsPerThread,
                elementsPerThread
            )
        );
    }
    for (auto &thread : threads) {
        thread.join();
    }
    return arrayOut;
};

Akira · Accepted Answer

For all threads I would start the processing from consecutive rows of the first matrix , i.e. the 0th thread will process the 0th row, the 1st will process 1st row, and so on to the nth thread.

After a thread processed a row it has to jump to the next row by the number of threads, i.e. if I have 2 threads, after the 0th finished processig the 0th row, it will jump to the 2nd row and process it.

Let's see it in a working example:

#include 
#include 
#include 
#include 

// multiplies the specified row and column from specified matrices
void multiply(const int* m_1, const int* m_2,
        std::size_t size, std::size_t row, std::size_t col, int* m_res) {
    for(std::size_t i = 0; i < size; ++i)
        m_res[row * size + col] += m_1[row * size + i] * m_2[i * size + col];
}

int main() {
    constexpr int N = 3, THREAD_NUM = 2;

    // matrices to multiply and a matrix for result
    std::unique_ptr A(new int[N * N] {
        11, 12, 13, 21, 22, 23, 31, 32, 33
    });
    std::unique_ptr B(new int[N * N] {
        1, 0, 0, 0, 1, 0, 0, 0, 1
    });
    std::unique_ptr C(new int[N * N] {});

    // create vector for running threads then assign threads to its elements
    std::vector thread_group(THREAD_NUM);

    for(int thread_i = 0; thread_i < THREAD_NUM; ++thread_i)
        thread_group[thread_i] = std::thread([&, thread_i]() {

            // each thread stars from consecutive rows then steps by 
            // the number of threads
            for(int row = thread_i; row < N; row += THREAD_NUM) {
                for(int col = 0; col < N; ++col)
                    multiply(A.get(), B.get(), N, row, col, C.get());
            }
        });

    for(auto& t : thread_group)
        t.join();

    // show the result
    for(int i = 0; i < N; ++i) {
        for(int j = 0; j < N; ++j)
            std::cout << (j ? "	" : "") << C[i * N + j];
        std::cout << std::endl;
    }
}

How to do a matrix multiplication using <threads> and a 1-D array in c++?

Answers (2)

Related Questions

How to do a matrix multiplication using &lt;threads&gt; and a 1-D array in c++?

Answers (2)

Related Questions

How to do a matrix multiplication using <threads> and a 1-D array in c++?