Convex Leopard

Reputation: 121

Matrix Multiplication OpenMP Counter-Intuitive Results

I am currently porting some code to OpenMP at my place of work. One of my tasks is to speed up matrix multiplication for one of our applications.

The matrices are stored in row-major format, so A[i*cols + j] gives the A_i_j element of matrix A.

The code looks like this (uncommenting the pragma parallelises the code):

#include <omp.h>
#include <iostream>
#include <iomanip>
#include <stdio.h>

#define NUM_THREADS 8
#define size 500
#define num_iter 10

int main (int argc, char *argv[])
{
//    omp_set_num_threads(NUM_THREADS);

    int *A = new int [size*size];
    int *B = new int [size*size];
    int *C = new int [size*size];

    for (int i=0; i<size; i++)
    {
        for (int j=0; j<size; j++)
        {
            A[i*size+j] = j*1;
            B[i*size+j] = i*j+2;
            C[i*size+j] = 0;
        }
    }

    double total_time = 0;
    double start = 0;

    for (int t=0; t<num_iter; t++)
    {
        start = omp_get_wtime();

        int i, k;

//        #pragma omp parallel for  num_threads(10) private(i, k) collapse(2) schedule(dynamic)
        for (int j=0; j<size; j++)
        {
            for (i=0; i<size; i++)
            {
                for (k=0; k<size; k++)
                {
                    C[i*size+j] += A[i*size+k] * B[k*size+j];
                }
            }
        }

        total_time += omp_get_wtime() - start;
    }

    // setprecision is a stream manipulator and must be streamed
    // into the output for it to take effect
    std::cout << std::setprecision(5) << total_time/num_iter << std::endl;

    delete[] A;
    delete[] B;
    delete[] C;

    return 0;
}

What is confusing me is the following: why is dynamic scheduling faster than static scheduling for this task? Timing the runs and averaging shows that static scheduling is slower, which seems counterintuitive to me, since each thread does the same amount of work.

Also, am I correctly speeding up my matrix multiplication code?

Upvotes: 0

Views: 69

Answers (1)

Jim Cownie

Reputation: 2859

Parallel matrix multiplication is non-trivial (have you considered cache-blocking?). Your best bet is likely to use a BLAS library for this, rather than writing it yourself. (Remember, "the best code is the code I do not have to write".)

Wikipedia: Basic Linear Algebra Subprograms points to many implementations, many of which (including the Intel Math Kernel Library) have free licenses.

Upvotes: 1
