Reputation: 1407
I don't have much experience with OpenMP.
Is it possible to make the following code faster by using a for loop over a pointer instead of an index?
Is there any other way to make the following code faster?
The code multiplies an array by a constant.
Thank you.
code:
#include <iostream>
#include <stdlib.h>
#include <stdint.h>
#include <vector>
using namespace std;
int main(void){
    size_t dim0, dim1;
    dim0 = 100;
    dim1 = 200;
    std::vector<float> vec;
    vec.resize(dim0*dim1);      // elements are value-initialized to 0.0f
    float scalar = 0.9f;        // float literal, avoids a double-to-float conversion
    size_t size_sq = dim0*dim1;
    #pragma omp parallel
    {
        #pragma omp for
        for(size_t i = 0; i < size_sq; ++i){
            vec[i] *= scalar;
        }
    }
}
Serial pointer loop:
float* ptr_start = vec.data();
float* ptr_end = ptr_start + dim0*dim1;
float* ptr_now;
for(ptr_now = ptr_start; ptr_now != ptr_end; ++ptr_now){
    *(ptr_now) *= scalar;
}
Upvotes: 0
Views: 1808
Reputation: 6214
A parallel loop that goes through the raw pointer should look like this:
size_t size_sq = vec.size();
float * ptr = vec.data();
#pragma omp parallel
{
    #pragma omp for
    for(size_t i = 0; i < size_sq; i++){
        ptr[i] *= scalar;
    }
}
ptr will be the same for all threads, so there is no problem there.
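If you want the loop variable itself to be a pointer, as in the serial loop from the question, OpenMP 3.0 and later accept a pointer as the iteration variable of a parallel for loop. The following is only a sketch: the function name scale_with_pointer_loop is made up for illustration, and whether it is any faster than the indexed version is something you would have to measure.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): scales every element
// of `vec` in place, using a pointer as the OpenMP loop variable.
void scale_with_pointer_loop(std::vector<float>& vec, float scalar) {
    float* const ptr_start = vec.data();
    float* const ptr_end = ptr_start + vec.size();

    // The loop variable `p` is declared in the for statement, so every
    // thread works on its own private copy of it. The condition uses `<`
    // because `!=` is only accepted in an OpenMP loop from version 5.0 on.
    #pragma omp parallel for
    for (float* p = ptr_start; p < ptr_end; ++p) {
        *p *= scalar;
    }
}
```

Calling scale_with_pointer_loop(vec, scalar) would replace the whole parallel region above. Without OpenMP enabled at compile time the pragma is ignored and the loop simply runs serially.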
As an explanation, here is how Wikipedia describes the data sharing attribute clauses:
shared: the data within a parallel region is shared, which means visible and accessible by all threads simultaneously. By default, all variables in the work sharing region are shared except the loop iteration counter.
private: the data within a parallel region is private to each thread, which means each thread will have a local copy and use it as a temporary variable. A private variable is not initialized and the value is not maintained for use outside the parallel region. By default, the loop iteration counters in the OpenMP loop constructs are private.
In this case, i is private and ptr is shared.
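To make these attributes visible in the code rather than relying on the defaults, you can add default(none) and list each variable yourself. This is a minimal sketch under that assumption; the function name scale_explicit_sharing is hypothetical and not part of the original code.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original post): same scaling loop,
// with every data-sharing attribute written out explicitly.
void scale_explicit_sharing(std::vector<float>& vec, float scalar) {
    std::size_t size_sq = vec.size();
    float* ptr = vec.data();

    // default(none) makes the compiler reject any variable used in the
    // region whose sharing attribute is not listed. `ptr`, `size_sq` and
    // `scalar` are only read, so sharing them is safe. The loop counter
    // `i` is private automatically and does not need a clause.
    #pragma omp parallel for default(none) shared(ptr, size_sq, scalar)
    for (std::size_t i = 0; i < size_sq; ++i) {
        ptr[i] *= scalar;
    }
}
```

Each thread writes to a different ptr[i], so even though ptr is shared, no two threads touch the same element.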
Upvotes: 1