user1000107

Can boost::atomic really improve performance by reducing the overhead of system calls (in mutex/semaphore) in multithreading?

I am trying to compare the performance of boost::atomic and a pthread mutex on Linux:

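    #include <pthread.h>
    #include <iostream>
    #include <boost/bind.hpp>
    #include <boost/threadpool.hpp>  // non-official Boost threadpool library

    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    int g = 0;

    // Each task increments the shared counter once under the mutex.
    void f()
    {
        pthread_mutex_lock(&mutex);
        ++g;
        pthread_mutex_unlock(&mutex);
    }

    const int threadnum = 100;

    int main()
    {
        boost::threadpool::fifo_pool tp(threadnum);
        for (int j = 0; j < 100; ++j)
        {
            // Schedule one task per pool thread, then wait for all of them.
            for (int i = 0; i < threadnum; ++i)
                tp.schedule(boost::bind(f));
            tp.wait();
        }
        std::cout << g << std::endl;
        return 0;
    }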

Its timing:

 real    0m0.308s
 user    0m0.176s
 sys     0m0.324s

I also tried boost::atomic:

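    #include <iostream>
    #include <boost/atomic.hpp>
    #include <boost/bind.hpp>
    #include <boost/threadpool.hpp>  // non-official Boost threadpool library

    boost::atomic<int> g(0);

    // Each task increments the shared counter once, atomically, with no lock.
    void f()
    {
        ++g;
    }

    const int threadnum = 100;

    int main()
    {
        boost::threadpool::fifo_pool tp(threadnum);
        for (int j = 0; j < 100; ++j)
        {
            // Schedule one task per pool thread, then wait for all of them.
            for (int i = 0; i < threadnum; ++i)
                tp.schedule(boost::bind(f));
            tp.wait();
        }
        std::cout << g << std::endl;
        return 0;
    }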

Its timing:

 real    0m0.344s
 user    0m0.250s
 sys     0m0.344s

I ran both versions many times, and the timing results were always similar.

Can atomics really help avoid the system-call overhead caused by a mutex/semaphore?

Any help will be appreciated.

Thanks

UPDATE: I added a loop of 1000000 iterations inside f() for the mutex version:

    for (int i = 0 ; i < 1000000 ; ++i)
    {
            pthread_mutex_lock(&mutex);
            ++g;
            pthread_mutex_unlock(&mutex);
    }

and did the same for the boost::atomic version.

I measured the time with "time ./app".

With boost::atomic:

real    0m13.577s
user    1m47.606s
sys     0m0.041s

With the pthread mutex:

real    0m17.478s
user    0m8.623s
sys     2m10.632s

It seems that boost::atomic is faster because the pthread mutex version spends far more time in system calls.

Why is user time + sys time larger than real time?

Any comments are welcome!

Upvotes: 3

Views: 2031

Answers (1)

I suspect you are not actually measuring the cost of atomics versus mutexes. Instead, you are measuring the overhead of the boost thread pool's task management: setting up a new task f() takes more time than executing the task itself.

I suggest adding another loop inside f() so that each task does enough work to dominate the measurement, something like this (do the same for the atomic version):

 void f()
 {
    for(int i = 0  ; i < 10000 ; i++) {
      pthread_mutex_lock(&mutex);
      ++g;
      pthread_mutex_unlock(&mutex);
    }
    return ;
 }

Please post the results if anything changes; I'd be interested to see the difference!

Upvotes: 5
