DiveIntoML

Reputation: 2537

OpenMP does not provide speedup for simple program

I just started learning OpenMP with C++, and I wrote a very simple program to check whether parallelizing it gives any speedup:

#include <iostream>
#include <ctime>
#include "omp.h"


int main() {
    const uint N = 1000000000;
    clock_t start_time = clock();
    #pragma omp parallel for
    for (uint i = 0; i < N; i++) {
        int x = 1+1;
    }

    clock_t end_time = clock();
    std::cout << "total_time: " << double(end_time - start_time) / CLOCKS_PER_SEC << " seconds." << std::endl;
}

The program takes 2.2 seconds without the parallel #pragma and 2.8 seconds with the parallel #pragma on 4 threads. What mistake did I make in the program? My compiler is clang++ 6.0, and the computer is a MacBook Pro with a 2.6 GHz i5 CPU running macOS 10.13.6.

EDIT:

I realized I used the wrong function for measuring execution time. Instead of clock() from the ctime library, I should use high_resolution_clock from the chrono library. With that change, I get 80 seconds for 1 thread, 47 seconds for 2 threads, and 35 seconds for 3 threads. Shouldn't the speedup be better than this, since the program is embarrassingly parallel?

Upvotes: 0

Views: 302

Answers (1)

Andrew Fan

Reputation: 1322

As with anything in parallel programming, creating new threads has a startup cost. For simple programs, the overhead of creating and managing threads is often large enough that the program runs slower than it does in a single thread.

In other words, you didn't make a mistake - this is an inherent part of using threads.

Upvotes: 0
