Reputation: 23
Depending on the gcc optimization level, my program dies from different OS signals, and I wonder whether the cause is the same or not.
I was getting a core dump due to an abort() in a multithreaded C++ program compiled with -O2.
Program terminated with signal 6, Aborted.
#0 0x00007ff2572d28a5 in raise () from /lib64/libc.so.6
I was not able to find the cause, as the abort seems to happen in the destructor of a local std::vector, which made no sense to me.
(gdb) thread 1
[Switching to thread 1 (Thread 0x7ff248d6c700 (LWP 16767))]#0 0x00007ff2572d28a5 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ff2572d28a5 in raise () from /lib64/libc.so.6
#1 0x00007ff2572d4085 in abort () from /lib64/libc.so.6
#2 0x00007ff25730fa37 in __libc_message () from /lib64/libc.so.6
#3 0x00007ff257315366 in malloc_printerr () from /lib64/libc.so.6
#4 0x00007ff257317e93 in _int_free () from /lib64/libc.so.6
#5 0x000000000044dd45 in deallocate (this=0x7ff250389610) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/ext/new_allocator.h:95
#6 _M_deallocate (this=0x7ff250389610) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_vector.h:146
#7 ~_Vector_base (this=0x7ff250389610) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_vector.h:132
#8 ~vector (this=0x7ff250389610) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_vector.h:313
#9 ...
Studying the code more closely, I realized that the vector was initialized from another vector coming from another thread and, here is the point, no mutex was used to do that. To simplify, I wrote the following code, which reproduces it (please ignore that stopThread is not protected).
#include <pthread.h>
#include <unistd.h>
#include <iostream>
#include <limits>
#include <vector>

// The post does not show these declarations; the types below are assumed.
pthread_mutex_t _mutex;
pthread_t _thread;
bool stopThread;
std::vector<double> sharedVector;

void* doWork(void*)
{
    while(!stopThread)
    {
        double min = std::numeric_limits<int>::max();
        double max = std::numeric_limits<int>::min();
        // Copy the shared data and empty it while holding the mutex.
        pthread_mutex_lock(&_mutex);
        std::vector<double> localVector = (sharedVector);
        sharedVector.clear();
        pthread_mutex_unlock(&_mutex);
        for(unsigned int index = 0; index < localVector.size(); ++index)
        {
            std::cout << "Thread 2 " << localVector[index] << ", " << std::endl;
            if(min > localVector[index])
            {
                min = localVector[index];
            }
            if(max < localVector[index])
            {
                max = localVector[index];
            }
        }
    }
    return NULL;
}

int main()
{
    pthread_mutex_init(&_mutex, NULL);
    stopThread = false;
    pthread_create(&_thread, NULL, doWork, NULL);
    for(int i = 0; i < 10000; i++)
    {
        sharedVector.push_back(i);
        std::cout << "Thread 1 " << i << std::endl;
        usleep(5000);
    }
    stopThread = true;
    pthread_join(_thread, NULL);
    pthread_cancel(_thread);
    std::cout << "Finished! " << std::endl;
}
I fixed that, but I cannot say that I solved the problem (I know I fixed a problem, but not necessarily the one I was looking for), because the core dump happens only about once a month. So I compiled with -O0 to see whether the core file would show more detail, and then I forced the program to crash. Now I get a segfault where I expected it.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f4598f70cd7 in memmove () from /lib64/libc.so.6
(gdb) bt
#0 0x00007f4598f70cd7 in memmove () from /lib64/libc.so.6
#1 0x000000000045fb84 in std::__copy_move<false, true, std::random_access_iterator_tag>::__copy_m<double> (__first=0x7f4580977ba0, __last=0x7f4580977ba8, __result=0x0)
at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_algobase.h:378
#2 0x0000000000465f01 in std::__copy_move_a<false, double const*, double*> (__first=0x7f4580977ba0, __last=0x7f4580977ba8, __result=0x0) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_algobase.h:397
#3 0x0000000000465e66 in std::__copy_move_a2<false, __gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*> (__first=4.3559999999999999, __last=3.1560000000000001, __result=0x0)
at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_algobase.h:436
#4 0x0000000000465d6d in std::copy<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*> (__first=4.3559999999999999, __last=3.1560000000000001, __result=0x0)
at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_algobase.h:468
#5 0x0000000000465c84 in std::__uninitialized_copy<true>::uninitialized_copy<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*> (__first=4.3559999999999999, __last=3.1560000000000001,
__result=0x0) at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_uninitialized.h:93
#6 0x0000000000465ad9 in std::uninitialized_copy<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*> (__first=4.3559999999999999, __last=3.1560000000000001, __result=0x0)
at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_uninitialized.h:117
#7 0x0000000000465718 in std::__uninitialized_copy_a<__gnu_cxx::__normal_iterator<double const*, std::vector<double, std::allocator<double> > >, double*, double> (__first=4.3559999999999999, __last=3.1560000000000001, __result=0x0)
at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_uninitialized.h:257
#8 0x00000000004650f9 in std::vector<double, std::allocator<double> >::vector (this=0x7f4594d90d70, __x=std::vector of length 1, capacity 4 = {...})
at /usr/lib/gcc/x86_64-redhat-linux/4.4.6/../../../../include/c++/4.4.6/bits/stl_vector.h:243
#9 ...
I looked for documentation but found nothing saying that the type of error can change with the optimization level. However, when I run the code above, which reproduces the problem, it segfaults when compiled with -O0 but finishes fine when compiled with -O2.
Thanks for your time
Upvotes: 1
Views: 1287
Reputation: 254501
You're locking the mutex while the worker thread accesses the shared vector, but not when the main thread modifies it. You need to guard all accesses to shared mutable data.
for(int i = 0; i < 10000; i++)
{
    pthread_mutex_lock(&_mutex);   // Add this
    sharedVector.push_back(i);
    pthread_mutex_unlock(&_mutex); // Add this
    std::cout << "Thread 1 " << i << std::endl;
    usleep(5000);
}
You might also consider using a condition variable to notify the worker thread when the vector changes, so that the worker doesn't consume resources busy-waiting.
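As a rough illustration of the condition-variable idea, here is a minimal sketch using plain pthreads. The names `SharedState`, `Stats`, `consume` and `runDemo` are hypothetical and not from the question. It also assumes that moving the stop flag under the same mutex is acceptable, and that swapping the vectors is an acceptable substitute for the copy-then-clear in the question. Return values of the pthread calls are ignored for brevity; build with `-pthread`.

```cpp
#include <pthread.h>
#include <cstddef>
#include <vector>

// Hypothetical names for this sketch; none come from the original post.
struct Stats {
    std::size_t consumed;  // number of values the worker has processed
    double min;
    double max;
};

struct SharedState {
    pthread_mutex_t mutex;
    pthread_cond_t changed;    // signalled after a push or a stop request
    std::vector<double> data;  // guarded by mutex
    bool stop;                 // guarded by mutex
    Stats stats;               // written by the worker, read after pthread_join
};

void* consume(void* arg)
{
    SharedState* s = static_cast<SharedState*>(arg);
    for (;;) {
        pthread_mutex_lock(&s->mutex);
        // Sleep until there is data or a stop request.
        // The loop re-checks the predicate, so spurious wakeups are harmless.
        while (s->data.empty() && !s->stop)
            pthread_cond_wait(&s->changed, &s->mutex);
        if (s->data.empty()) {  // stop requested and nothing left to drain
            pthread_mutex_unlock(&s->mutex);
            return NULL;
        }
        std::vector<double> local;
        local.swap(s->data);    // take the whole batch, leave the shared vector empty
        pthread_mutex_unlock(&s->mutex);

        // Process the batch without holding the lock.
        for (std::size_t i = 0; i < local.size(); ++i) {
            if (s->stats.consumed == 0 || local[i] < s->stats.min) s->stats.min = local[i];
            if (s->stats.consumed == 0 || local[i] > s->stats.max) s->stats.max = local[i];
            ++s->stats.consumed;
        }
    }
}

Stats runDemo(int count)
{
    SharedState s;
    pthread_mutex_init(&s.mutex, NULL);
    pthread_cond_init(&s.changed, NULL);
    s.stop = false;
    s.stats.consumed = 0;
    s.stats.min = 0.0;
    s.stats.max = 0.0;

    pthread_t worker;
    pthread_create(&worker, NULL, consume, &s);

    for (int i = 0; i < count; ++i) {
        pthread_mutex_lock(&s.mutex);    // every access to s.data holds the mutex
        s.data.push_back(i);
        pthread_mutex_unlock(&s.mutex);
        pthread_cond_signal(&s.changed); // wake the worker if it is waiting
    }

    pthread_mutex_lock(&s.mutex);
    s.stop = true;                       // the stop flag is guarded too
    pthread_mutex_unlock(&s.mutex);
    pthread_cond_signal(&s.changed);

    pthread_join(worker, NULL);          // after this, s.stats is safe to read
    pthread_cond_destroy(&s.changed);
    pthread_mutex_destroy(&s.mutex);
    return s.stats;
}
```

Calling `runDemo(1000)` from `main` pushes the values 0 to 999. The worker blocks in `pthread_cond_wait` while the vector is empty instead of spinning, and it drains any remaining values before it exits on the stop request.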
Upvotes: 9