Reputation: 23
I am new to OpenMP and have run into an error that I can't fix. The original code is far too big, so I wrote a small example that summarizes the problem:
I have multidimensional std::vector objects (2D and 3D) that should not be shared between the threads. If I mark them as private, they still cause memory errors, because the threads still share them.
I came up with a fix for that problem: I added one more dimension to the 2D vector, so each thread can access its own copy:
myVector[omp_get_thread_num()][1].push_back(i);
I know this is not a smart fix for my problem, but now each thread has its own copy of the 2D vector.
Now comes the strange part: this still sometimes crashes with memory errors if I don't put #pragma omp critical in front of it.
I don't really understand why that is necessary, because the threads should never access the same memory.
#include <iostream>
#include <omp.h>
#include <vector>

// This should represent my problem (without my fix).
int main() {
    std::vector<std::vector<int>> v;
    v.resize(3);
    #pragma omp parallel for num_threads(2) private(v)
    for (int i = 0; i < 10; i++) {
        v[1].push_back(i);
    }
    return 0;
}
I hope there is a better solution to make my 2D vector threadprivate.
P.S. It is not possible to allocate the vector inside the OpenMP part.
Upvotes: 2
Views: 586
Reputation: 22670
You have to understand that variables coming from an outside scope that are declared private work as if they are locally declared without an initializer. So each local copy is an empty vector, which means v[1] is out of bounds, hence your code can't work.
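To illustrate the initialization point: if the vector has to be set up outside the parallel region, the firstprivate clause copy-initializes each thread's copy from the outside value, so the copy keeps its three rows. A minimal sketch (the function name and the value 42 are made up for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the number of rows a thread sees in its
// firstprivate copy of v.
std::size_t rows_seen_with_firstprivate() {
    std::vector<std::vector<int>> v(3);  // sized outside the parallel region
    std::size_t seen = 0;                // shared, written under critical

    #pragma omp parallel num_threads(2) firstprivate(v)
    {
        // Each thread's copy was copy-constructed from the outside v.
        assert(v.size() > 1);  // with private(v) the copy would be empty
        v[1].push_back(42);    // modifies this thread's copy only

        #pragma omp critical
        seen = v.size();
    }
    return seen;
}
```

Note that every thread pays for a full copy of the vector, which may matter if the real data is large.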
Generally, it is better with OpenMP to declare private variables locally - that way you avoid a lot of confusion between the "outside value" and "inside private values" which are not connected at all. You can do this by splitting the parallel and for directives.
#pragma omp parallel
{
    std::vector<std::vector<int>> v;
    v.resize(3);
    #pragma omp for
    for (int i = 0; i < 10; i++) {
        v[1].push_back(i);
    }
}
Note that v is not available after the parallel region - this is good! In your original example v is available after the parallel region - but its value has nothing to do with the values from the threads inside.
If you need to retain the information from v, you may want to look at reduction, but it depends on your specific use case.
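If the per-thread results only need to be combined at the end, one option that avoids an actual reduction clause is to merge each thread's local vector into a shared one inside a critical section. A minimal sketch, assuming the order of the elements does not matter (collect, local and result are hypothetical names):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: gathers the loop indices 0..n-1 from all threads.
std::vector<int> collect(int n) {
    assert(n >= 0);
    std::vector<int> result;  // shared, only touched inside the critical section

    #pragma omp parallel
    {
        std::vector<int> local;  // declared in the region: one per thread

        #pragma omp for nowait
        for (int i = 0; i < n; i++) {
            local.push_back(i);
        }

        // Append this thread's part; one thread at a time.
        #pragma omp critical
        result.insert(result.end(), local.begin(), local.end());
    }

    // The merge order depends on thread scheduling, so sort afterwards.
    std::sort(result.begin(), result.end());
    return result;
}
```

The critical section is entered once per thread rather than once per element, so its cost is usually small compared to the loop itself.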
Your approach of myVector[omp_get_thread_num()] is a common naive approach. This code is correct, but in any case where you modify the values of the outermost vector, it has bad performance due to false sharing.
myVector[omp_get_thread_num()].push_back(); // Bad performance
myVector[omp_get_thread_num()][1].push_back(i); // Ok
So it is generally advisable not to do this and to use locally declared variables instead. Nevertheless, if this code of yours crashes, something else is wrong. In that case, please prepare a minimal reproducible example and ask a separate question (referencing this one).
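For reference, a self-contained version of the per-thread-slot approach might look like the sketch below. It assumes that every dimension is sized before the parallel region and that the outer vector has one slot per requested thread; whether the crashing code does this is not known from the question. The function name and the _OPENMP fallback are additions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Fallback so the sketch also builds without OpenMP support.
static int omp_get_thread_num() { return 0; }
#endif

// Hypothetical helper: fills one 2D vector per thread and returns the
// total number of elements stored.
std::size_t fill_per_thread_slots() {
    const int threads = 2;
    // Size all dimensions up front; resizing the outer vectors while
    // other threads are using them would be a data race.
    std::vector<std::vector<std::vector<int>>> myVector(
        threads, std::vector<std::vector<int>>(3));

    #pragma omp parallel for num_threads(threads)
    for (int i = 0; i < 10; i++) {
        const int tid = omp_get_thread_num();
        assert(tid < static_cast<int>(myVector.size()));
        myVector[tid][1].push_back(i);  // only this thread touches slot tid
    }

    std::size_t total = 0;
    for (const auto& slot : myVector) {
        total += slot[1].size();
    }
    return total;
}
```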
Now, threadprivate is something different from private. private is usually what you want and refers to the specific task / scope. In most cases, you don't need or want threadprivate.
Upvotes: 2
Reputation: 23497
You don't specify many important details about your original code. However, one way might be to create a parallel region first, define the private vectors within it, and then parallelize the loop. For your example code, it might look like:
#include <vector>

int main() {
    #pragma omp parallel
    {
        // Each thread constructs its own v inside the parallel region.
        std::vector<std::vector<int>> v;
        v.resize(3);
        #pragma omp for
        for (int i = 0; i < 10; i++)
            v[1].push_back(i);
    }
}
Upvotes: 0
Reputation: 1247
You could use the thread_local storage specifier:
int main(){
thread_local std::vector < std::vector < int > > v;
// ...
}
Upvotes: 0