Matrix multiplication using multiple threads

Question

So I am trying to compute (M by N matrix) times (N by 1 vector) operations with threads into a resulting vector. The question in my book says that I should think about how many threads to use, and I assume since the result matrix should be M by 1 then I should use M threads, one for each set of operations. M is height, and N is width.

To create the threads I use

thread* myThreads = new thread[height];

Then I call the MatrixMultThreads function i times. At the end I join all the threads.

    for (int i = 0; i < height; i++)
    {
        myThreads[i] = thread(MatrixMultThreads, my2DArray, vector, height, width);
    }
    for (int i = 0; i < height; i++)
    {
        myThreads[i].join();
    }

What I am having trouble figuring out is how should I sum up all the resulting values in the correct order. How would I tell each specific thread what to do. I was thinking, maybe I should create a global variable step_i and set it to 0, then each time the function is called I can iterate that variable. then since I can pass the width of the array, I go through each step_i and add arr[i][j] * vector[j]

Elliott · Accepted Answer

What I am having trouble figuring out is how should I sum up all the resulting values in the correct order.

They can be summed out-of-order, which is why this is a good problem to solve with multi-threading. If ordering matters to a specific problem, you can't improve it with multithreading (to be clear, if any sub-problem can be solved out-of-order then that sub-problem is a potential candidate for multithreading).

One solution to your problem is to set up a solution vector at the call site, then pass the corresponding element by reference (also the MatrixMultiply function needs to know which problem it's solving):

void MatrixMultiply(const Array2d& matrix, 
       const vector& vec, int row, int& solution);

// ...

vector result(height);

for (int i = 0; i < height; i++)
{
    threads[i] = thread(MatrixMultiply, array2d, array1d, i, result[i]);
}

Your 2D array should really provide info on its height and width without having to pass these values explicitly.

BONUS INFO:

We could make this solution much more OOP in a way that you'll want to reuse for future problems (and some experienced programmers seem to miss this trick for using arrays):

MatrixMultiply function is really similar to a dot-product function:

template 
auto DotProduct(const V1& vec1, const V2& vec2)
{
    auto result = vec1[0] * vec2[0];

    for (size_t i = 1; i < vec1.size(); ++i)
        result += vec1[i] * vec2[i];

    return result;
}

template 
auto DotProduct(const V1& vec1, const V2& vec2, T& result)
{
    result = DotProduct(vec1, vec2);
}

(The above allows the vectors to be any objects that uses size() and [] methods as expected.)

We can write a wrapper class around std::vector that can be used by our array class to handle all the indexing for us; like this:

template 
class SubVector
{
    const typename std::vector::iterator m_it;
    const size_t m_size, m_interval_size;

public:

    SubVector (std::vector& v, size_t start, size_t sub_size, size_t i_size = 1)
        : m_it(v.begin() + start), m_size(sub_size), m_interval_size(i_size)
    {}
    
    auto size () const
    {
        return m_size;
    }
    
    const T& operator [] (size_t i) const
    {
        return it[i*m_interval_size];
    }

    T& operator [] (size_t i)
    {
        return it[i*m_interval_size];
    }
};

Then you could use this in some kind of Vectorise method in your array; like this:

template >
class Array2D
{
    std::vector m_data;

    size_t m_width, m_height;

public:

    // your normal methods  

    auto VectoriseRow(int r) const
    {
        return SubVector(m_data, r*m_width, m_width);
    }

    auto VectoriseColumn(int c) const
    {
        return SubVector(m_data, c, m_height, m_width);
    }
}

(Note: We could add the Vectorise feature to std::array or boost::multi_array by just writing a wrapper around them, which makes our array class more generic and saves us from having to do all the work. boost actually has this sort of feature inbuilt with array_view.)

Now our call site can be like so:

vector result(height);

for (int i = 0; i < height; i++)
{
    threads[i] = thread(DotProduct, array2d.VectoriseRow(i), array1d, result[i]);
}

This might seem like a more verbose way of solving the original problem (because it is), but if you use multi-dimensional arrays in your coding you'll find you no longer have to write multi-array-specific functions, or handle ugly indices for sub-problems (even in 1D problems, like Mean of Means). When dealing with those sorts of problems, you'll invariably want to reuse something like the above code.

Matrix multiplication using multiple threads

Answers (2)

Related Questions