lechuck

Reputation: 23

C++ Different errors using different gcc optimizations

Depending on the gcc optimization level, my program dies from different OS signals, and I wonder whether the underlying cause is the same.

I was getting a core dump caused by an abort() in a multithreaded C++ program compiled with -O2.

Program terminated with signal 6, Aborted.
#0  0x00007ff2572d28a5 in raise () from /lib64/libc.so.6

I could not find the cause: the abort seems to happen in the destructor of a local std::vector, which made no sense to me.

(gdb) thread 1
[Switching to thread 1 (Thread 0x7ff248d6c700 (LWP 16767))]#0  0x00007ff2572d28a5 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00007ff2572d28a5 in raise () from /lib64/libc.so.6
#1  0x00007ff2572d4085 in abort () from /lib64/libc.so.6
#2  0x00007ff25730fa37 in __libc_message () from /lib64/libc.so.6
#3  0x00007ff257315366 in malloc_printerr () from /lib64/libc.so.6
#4  0x00007ff257317e93 in _int_free () from /lib64/libc.so.6
#5  0x000000000044dd45 in deallocate (this=0x7ff250389610) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/ext/new_allocator.h:95
#6  _M_deallocate (this=0x7ff250389610) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_vector.h:146
#7  ~_Vector_base (this=0x7ff250389610) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_vector.h:132
#8  ~vector (this=0x7ff250389610) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_vector.h:313
#9  ...

Studying the code more closely, I realized that the vector was initialized from another vector coming from a different thread and, here is the point, no mutex was used to protect that copy. To simplify, I wrote the following code, which reproduces the problem (please ignore that stopThread is not protected).

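// Declarations not shown in the original post; types are inferred from usage.
#include <pthread.h>
#include <unistd.h>
#include <iostream>
#include <limits>
#include <vector>

pthread_t _thread;
pthread_mutex_t _mutex;
bool stopThread;
std::vector<double> sharedVector;

void* doWork(void*)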
{
    while(!stopThread)
    {
        double min = std::numeric_limits<int>::max();
        double max = std::numeric_limits<int>::min();
        pthread_mutex_lock(&_mutex);
        std::vector<double> localVector = (sharedVector);
        sharedVector.clear();
        pthread_mutex_unlock(&_mutex);

        for(unsigned int index = 0; index < localVector.size(); ++index)
        {
            std::cout << "Thread 2 " << localVector[index] << ", " << std::endl;
            if(min > localVector[index])
            {
                min = localVector[index];
            }
            if(max < localVector[index])
            {
                max = localVector[index];
            }
        }
    }
    return NULL;
}

int main()
{
    pthread_mutex_init(&_mutex, NULL);
    stopThread = false;

    pthread_create(&_thread, NULL, doWork, NULL);

    for(int i = 0; i < 10000; i++)
    {
        sharedVector.push_back(i);
        std::cout << "Thread 1 " << i << std::endl;
        usleep(5000);
    }
    stopThread = true;

    pthread_join(_thread, NULL);   // join only: cancelling an already-joined thread is invalid

    std::cout << "Finished! " << std::endl;
}

I fixed that, but I cannot say that I solved the problem (I know I fixed a problem, but not necessarily the one I was looking for), as the core dump happens about once a month. So I decided to compile with -O0 to see whether the core file would show more detail, and then forced the program to crash. Now what I get is a segmentation fault where I expected it.

Program terminated with signal 11, Segmentation fault.
#0  0x00007f4598f70cd7 in memmove () from /lib64/libc.so.6

(gdb) bt
#0  0x00007f4598f70cd7 in memmove () from /lib64/libc.so.6
#1  0x000000000045fb84 in std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<double> (__first=0x7f4580977ba0, __last=0x7f4580977ba8, __result=0x0)
    at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_algobase.h:378
#2  0x0000000000465f01 in std::__copy_move_a<false, double const*, double*> (__first=0x7f4580977ba0, __last=0x7f4580977ba8, __result=0x0) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_algobase.h:397
#3  0x0000000000465e66 in std::__copy_move_a2<false, __gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*> (__first=4.3559999999999999, __last=3.1560000000000001, __result=0x0)
    at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_algobase.h:436
#4  0x0000000000465d6d in std::copy<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*> (__first=4.3559999999999999, __last=3.1560000000000001, __result=0x0)
    at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_algobase.h:468
#5  0x0000000000465c84 in std::__uninitialized_copy<true>::uninitialized_copy<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*> (__first=4.3559999999999999, __last=3.1560000000000001, 
    __result=0x0) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_uninitialized.h:93
#6  0x0000000000465ad9 in std::uninitialized_copy<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*> (__first=4.3559999999999999, __last=3.1560000000000001, __result=0x0)
    at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_uninitialized.h:117
#7  0x0000000000465718 in std::__uninitialized_copy_a<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*, double> (__first=4.3559999999999999, __last=3.1560000000000001, __result=0x0)
    at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_uninitialized.h:257
#8  0x00000000004650f9 in std::vector<double, std::allocator<double> >::vector (this=0x7f4594d90d70, __x=std::vector of length 1, capacity 4 = {...})
    at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_vector.h:243
#9  ...

I looked for documentation but found nothing saying that the type of error can change with the optimization level. However, when I run the code above, which reproduces the problem, a segmentation fault happens when it is compiled with -O0, but it finishes fine when compiled with -O2.

Thanks for your time

Upvotes: 1

Views: 1287

Answers (1)

Mike Seymour

Reputation: 254501

You're locking the mutex while the worker thread accesses the shared vector, but not when the main thread modifies it. You need to guard all accesses to shared mutable data.

for(int i = 0; i < 10000; i++)
{
    pthread_mutex_lock(&_mutex);                // Add this
    sharedVector.push_back(i);
    pthread_mutex_unlock(&_mutex);              // Add this
    std::cout << "Thread 1 " << i << std::endl;
    usleep(5000);
}

You might also consider using a condition variable to notify the worker thread when the vector changes, so that the worker doesn't consume resources busy-waiting.
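
As a rough illustration of the condition-variable idea, here is a minimal sketch using plain pthreads. It is not the asker's real program: SharedQueue, worker, produceAndConsume and the consumed counter are hypothetical names made up for this example, and error checking of the pthread calls is omitted for brevity. The producer signals after every push, and the worker sleeps in pthread_cond_wait instead of spinning.

```cpp
#include <pthread.h>
#include <cstddef>
#include <vector>

// Hypothetical helper type for this sketch; not part of the original program.
struct SharedQueue {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    std::vector<double> data;  // guarded by mutex
    bool stop;                 // guarded by mutex
    std::size_t consumed;      // written only by the worker, read after join
};

void* worker(void* arg)
{
    SharedQueue* q = static_cast<SharedQueue*>(arg);
    for (;;) {
        std::vector<double> local;

        pthread_mutex_lock(&q->mutex);
        // Sleep until there is data or a stop request.
        // The loop re-checks the condition to cope with spurious wakeups.
        while (q->data.empty() && !q->stop)
            pthread_cond_wait(&q->cond, &q->mutex);
        local.swap(q->data);   // take everything, leave the shared vector empty
        bool done = q->stop;
        pthread_mutex_unlock(&q->mutex);

        // Process the local copy outside the lock (here: just count it).
        q->consumed += local.size();
        if (done)
            break;
    }
    return NULL;
}

// Pushes `count` values from the calling thread and returns how many
// values the worker thread processed.
std::size_t produceAndConsume(int count)
{
    SharedQueue q;
    pthread_mutex_init(&q.mutex, NULL);
    pthread_cond_init(&q.cond, NULL);
    q.stop = false;
    q.consumed = 0;

    pthread_t thread;
    pthread_create(&thread, NULL, worker, &q);

    for (int i = 0; i < count; ++i) {
        pthread_mutex_lock(&q.mutex);      // every access is guarded
        q.data.push_back(i);
        pthread_mutex_unlock(&q.mutex);
        pthread_cond_signal(&q.cond);      // wake the worker
    }

    pthread_mutex_lock(&q.mutex);
    q.stop = true;                         // the stop flag is guarded too
    pthread_mutex_unlock(&q.mutex);
    pthread_cond_signal(&q.cond);

    pthread_join(thread, NULL);
    pthread_cond_destroy(&q.cond);
    pthread_mutex_destroy(&q.mutex);
    return q.consumed;
}
```

Calling produceAndConsume(10000) from main should return 10000, since the worker drains whatever is left in the vector before it honours the stop flag. Swapping the vectors under the lock instead of copying also keeps the critical section short.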

Upvotes: 9
