John Baltimore
John Baltimore

Reputation: 97

Matrix multiplication using multiple threads

So I am trying to compute (M by N matrix) times (N by 1 vector) operations with threads into a resulting vector. The question in my book says that I should think about how many threads to use, and I assume since the result matrix should be M by 1 then I should use M threads, one for each set of operations. M is height, and N is width.

To create the threads I use

thread* myThreads = new thread[height];

Then I call the MatrixMultThreads function i times. At the end I join all the threads.

    for (int i = 0; i < height; i++)
    {
        myThreads[i] = thread(MatrixMultThreads, my2DArray, vector, height, width);
    }
    for (int i = 0; i < height; i++)
    {
        myThreads[i].join();
    }

What I am having trouble figuring out is how should I sum up all the resulting values in the correct order. How would I tell each specific thread what to do. I was thinking, maybe I should create a global variable step_i and set it to 0, then each time the function is called I can iterate that variable. then since I can pass the width of the array, I go through each step_i and add arr[i][j] * vector[j]

Upvotes: 0

Views: 1415

Answers (2)

Elliott
Elliott

Reputation: 2623

What I am having trouble figuring out is how should I sum up all the resulting values in the correct order.

They can be summed out-of-order, which is why this is a good problem to solve with multi-threading. If ordering matters to a specific problem, you can't improve it with multithreading (to be clear, if any sub-problem can be solved out-of-order then that sub-problem is a potential candidate for multithreading).

One solution to your problem is to set up a solution vector at the call site, then pass the corresponding element by reference (also the MatrixMultiply function needs to know which problem it's solving):

void MatrixMultiply(const Array2d& matrix, 
       const vector<int>& vec, int row, int& solution);

// ...

vector<int> result(height);

for (int i = 0; i < height; i++)
{
    threads[i] = thread(MatrixMultiply, array2d, array1d, i, result[i]);
}

Your 2D array should really provide info on its height and width without having to pass these values explicitly.


BONUS INFO:

We could make this solution much more OOP in a way that you'll want to reuse for future problems (and some experienced programmers seem to miss this trick for using arrays):

MatrixMultiply function is really similar to a dot-product function:

template <typename V1, typename V2>
auto DotProduct(const V1& vec1, const V2& vec2)
{
    auto result = vec1[0] * vec2[0];

    for (size_t i = 1; i < vec1.size(); ++i)
        result += vec1[i] * vec2[i];

    return result;
}

template <typename V1, typename V2, typename T>
auto DotProduct(const V1& vec1, const V2& vec2, T& result)
{
    result = DotProduct(vec1, vec2);
}

(The above allows the vectors to be any objects that uses size() and [] methods as expected.)

We can write a wrapper class around std::vector that can be used by our array class to handle all the indexing for us; like this:

template <typename T, typename A>
class SubVector
{
    const typename std::vector<T,A>::iterator m_it;
    const size_t m_size, m_interval_size;

public:

    SubVector (std::vector<T,A>& v, size_t start, size_t sub_size, size_t i_size = 1)
        : m_it(v.begin() + start), m_size(sub_size), m_interval_size(i_size)
    {}
    
    auto size () const
    {
        return m_size;
    }
    
    const T& operator [] (size_t i) const
    {
        return it[i*m_interval_size];
    }

    T& operator [] (size_t i)
    {
        return it[i*m_interval_size];
    }
};

Then you could use this in some kind of Vectorise method in your array; like this:

template <typename T, typename A = std::allocator<T>>
class Array2D
{
    std::vector<T,A> m_data;

    size_t m_width, m_height;

public:

    // your normal methods  

    auto VectoriseRow(int r) const
    {
        return SubVector(m_data, r*m_width, m_width);
    }

    auto VectoriseColumn(int c) const
    {
        return SubVector(m_data, c, m_height, m_width);
    }
}

(Note: We could add the Vectorise feature to std::array or boost::multi_array by just writing a wrapper around them, which makes our array class more generic and saves us from having to do all the work. boost actually has this sort of feature inbuilt with array_view.)

Now our call site can be like so:

vector<int> result(height);

for (int i = 0; i < height; i++)
{
    threads[i] = thread(DotProduct, array2d.VectoriseRow(i), array1d, result[i]);
}

This might seem like a more verbose way of solving the original problem (because it is), but if you use multi-dimensional arrays in your coding you'll find you no longer have to write multi-array-specific functions, or handle ugly indices for sub-problems (even in 1D problems, like Mean of Means). When dealing with those sorts of problems, you'll invariably want to reuse something like the above code.

Upvotes: 1

Laurent Jospin
Laurent Jospin

Reputation: 614

You can store the results of the rows dot the Nx1 vector in a Mx1 vector and then do the sum.

By the way, you would be much better using OpenMP for such a problem, it would automatize most of your threads managements according to the number of cores on your machine, since here you might spawn a lot of threads:

https://www.openmp.org/

http://www.bowdoin.edu/~ltoma/teaching/cs3225-GIS/fall17/Lectures/openmp.html

Upvotes: 1

Related Questions