Why is the average speed of n threads not as fast as a single thread in C?

Question

I wrote a program with 2 threads doing the same thing, but I found that the throughput of each thread is lower than when I spawn only one thread. So I wrote this simple test to see whether the problem is in my code or in the system.

#include <stdio.h>
#include <pthread.h>
#include <time.h>


/*
 * Function: run_add
 * -----------------------
 * Performs iteration ^ 3 additions and prints timing statistics
 *
 * returns: NULL
 */
void *run_add(void *ptr) {
  clock_t t1, t2;
  t1 = clock();

  int sum = 0;
  int i = 0, j = 0, k = 0;
  int iteration = 1000;
  long total = iteration * iteration * iteration;
  for (i = 0; i < iteration; i++) {
    for (j = 0; j < iteration; j++) {
      for (k = 0; k < iteration; k++) {
        sum++;
      }
    }
  }

  t2 = clock();
  float diff = ((float)(t2 - t1) / 1000000.0F );
  printf("thread id = %d\n", (int)(pthread_self()));
  printf("Total addtions: %ld\n", total);
  printf("Total time: %f second\n", diff);
  printf("Addition per second: %f\n", total / diff);
  printf("\n");

  return NULL;
}


void run_test(int num_thread) {
  pthread_t pth_arr[num_thread];
  int i = 0;
  for (i = 0; i < num_thread; i++) {
    pthread_create(&pth_arr[i], NULL, run_add, NULL);
  }

  for (i = 0; i < num_thread; i++) {
    pthread_join(pth_arr[i], NULL);
  }
}

int main() {
  int num_thread = 5;
  int i = 0;
  for (i = 1; i < num_thread; i++) {
    printf("Running SUM with %d threads. \n\n", i);
    run_test(i);
  }
  return 0;
}

The result still shows the average speed of n threads is slower than one single thread. The more threads I have, the slower each one is.

Here's the result:

Running SUM with 1 threads.

thread id = 528384, Total addtions: 1000000000, Total time: 1.441257 second, Addition per second: 693838784.000000

Running SUM with 2 threads.

thread id = 528384, Total addtions: 1000000000, Total time: 2.970870 second, Addition per second: 336601728.000000

thread id = 1064960, Total addtions: 1000000000, Total time: 2.972992 second, Addition per second: 336361504.000000

Running SUM with 3 threads.

thread id = 1064960, Total addtions: 1000000000, Total time: 4.434701 second, Addition per second: 225494352.000000

thread id = 1601536, Total addtions: 1000000000, Total time: 4.449250 second, Addition per second: 224756976.000000

thread id = 528384, Total addtions: 1000000000, Total time: 4.454826 second, Addition per second: 224475664.000000

Running SUM with 4 threads.

thread id = 528384, Total addtions: 1000000000, Total time: 6.261967 second, Addition per second: 159694224.000000

thread id = 1064960, Total addtions: 1000000000, Total time: 6.293107 second, Addition per second: 158904016.000000

thread id = 2138112, Total addtions: 1000000000, Total time: 6.295047 second, Addition per second: 158855056.000000

thread id = 1601536, Total addtions: 1000000000, Total time: 6.306261 second, Addition per second: 158572560.000000

I have a 4-core CPU, and my system monitor shows that each time I run n threads, n CPU cores are 100% utilized. Is it true that n threads (with n <= my number of CPU cores) are supposed to run n times as fast as one thread? Why is that not the case here?

Jasen · Accepted Answer

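clock() measures CPU time, not wall time, and it measures the total CPU time of all threads in the process. With n threads running in parallel, that total grows about n times as fast as the wall clock, which is why each thread's measured time scales with n in your output.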

CPU time is the time the processor spent executing your code; wall time is real-world elapsed time (what a clock on the wall would show).
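To see the difference in isolation, here is a minimal sketch that sleeps for a while and measures the same interval both ways. The helper names (wall_seconds, cpu_seconds, measure_sleep) are made up for this illustration, and it assumes a POSIX system where clock_gettime() and nanosleep() are available.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Wall-clock seconds read from a monotonic clock. */
static double wall_seconds(void) {
  struct timespec ts;
  int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
  assert(rc == 0);
  (void)rc;
  return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* CPU seconds used by the whole process so far, as reported by clock(). */
static double cpu_seconds(void) {
  return (double)clock() / CLOCKS_PER_SEC;
}

/* Sleep for ms milliseconds; report how much wall and CPU time went by. */
static void measure_sleep(long ms, double *wall, double *cpu) {
  struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };
  double w0 = wall_seconds();
  double c0 = cpu_seconds();
  nanosleep(&req, NULL);  /* the process uses almost no CPU while asleep */
  *wall = wall_seconds() - w0;
  *cpu = cpu_seconds() - c0;
}
```

Calling measure_sleep(200, &wall, &cpu) should give a wall value of roughly 0.2 seconds and a cpu value close to zero, since a sleeping process is not executing anything. A busy loop spread over several threads is the opposite case: CPU time accumulates faster than wall time.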

Time your program with /usr/bin/time to see what's really happening, or use a wall-time function like time(), gettimeofday() or clock_gettime().

clock_gettime() can measure CPU time for this thread, CPU time for this process, or wall time. It's probably the best way to do this type of experiment.
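As a rough sketch of how the test from the question could be timed with clock_gettime(), the code below records wall time and per-thread CPU time inside each thread, and process CPU time around the whole run. The names read_clock, struct timing, timed_add and run_timed are hypothetical, the loop is shortened to 100 million additions, and the 16-thread cap is an arbitrary choice for the sketch. It assumes a POSIX system that provides CLOCK_MONOTONIC, CLOCK_THREAD_CPUTIME_ID and CLOCK_PROCESS_CPUTIME_ID.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Read the given clock and return its value in seconds. */
static double read_clock(clockid_t id) {
  struct timespec ts;
  int rc = clock_gettime(id, &ts);
  assert(rc == 0);
  (void)rc;
  return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Per-thread measurements (hypothetical struct for this sketch). */
struct timing {
  double wall;        /* real time that elapsed while the loop ran */
  double thread_cpu;  /* CPU time used by this thread only */
};

/* Busy loop in the spirit of run_add, timed with two different clocks. */
static void *timed_add(void *arg) {
  struct timing *t = arg;
  double w0 = read_clock(CLOCK_MONOTONIC);
  double c0 = read_clock(CLOCK_THREAD_CPUTIME_ID);

  volatile long sum = 0;  /* volatile keeps the loop from being optimized away */
  for (long i = 0; i < 100000000L; i++) {
    sum++;
  }

  t->wall = read_clock(CLOCK_MONOTONIC) - w0;
  t->thread_cpu = read_clock(CLOCK_THREAD_CPUTIME_ID) - c0;
  return NULL;
}

/* Run n threads (1 <= n <= 16), fill out[0..n-1], and return the CPU
 * seconds the whole process used during the run. */
static double run_timed(int n, struct timing *out) {
  pthread_t th[16];
  assert(n >= 1 && n <= 16);
  double p0 = read_clock(CLOCK_PROCESS_CPUTIME_ID);
  for (int i = 0; i < n; i++) {
    int rc = pthread_create(&th[i], NULL, timed_add, &out[i]);
    assert(rc == 0);
    (void)rc;
  }
  for (int i = 0; i < n; i++) {
    pthread_join(th[i], NULL);
  }
  return read_clock(CLOCK_PROCESS_CPUTIME_ID) - p0;
}
```

Compile with -pthread and call run_timed(n, results) for n = 1 to 4. If there are enough idle cores, each thread's wall and thread_cpu values should stay roughly constant as n grows, while the returned process CPU time should grow roughly in proportion to n. That process-wide total is what the original test was reporting through clock().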
