Reputation: 73
I had to use a homebrewed gcc
until now to compile OMP-augmented code on my Mac.
Good news is, Apple Clang is now able to find the OMP headers (at least in its Apple LLVM version 9.1.0 (clang-902.0.39.2)
version).
Bad news are that custom reduction clauses that used to work do not anymore. I've attached the code snippet below that demonstrates my problem. It immediately crashes upon entering the parallel block with either a segfault or the following error:
DebugOMP(46436,0x7fff8fc12380) malloc: *** error for object 0x7fff8fc02000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
Is there a way to fix this reduction? Simpler OMP clauses like #pragma omp parallel for
work fine. I am using Armadillo 9.100.5. The same issue happens with Eigen.
main.cpp :
#include <armadillo>
#pragma omp declare reduction( + : arma::vec : omp_out += omp_in ) \
initializer( omp_priv = omp_orig )
int main() {
int N = 10000;
int M = 100;
double a = 0;
// Built-in reduction, works
#pragma omp parallel for reduction(+:a)
for (int k = 0; k < M; ++k){
a += k;
}
std::cout << a << std::endl;
arma::vec v = arma::zeros<arma::vec>(M);
// Parallel access, works
#pragma omp parallel for
for (int k = 0; k < M; ++k){
v(k) = k;
}
std::cout << v << std::endl;
// Custom, reduction, segfaults
#pragma omp parallel for reduction(+:v)
for (int i = 0; i < N; ++i){
v += arma::ones<arma::vec>(v.n_rows);
}
std::cout << v << std::endl;
return 0;
}
CMakeLists.txt:
cmake_minimum_required(VERSION 3.0.0)
# Building procedure
get_filename_component(dirName ${CMAKE_CURRENT_SOURCE_DIR} NAME)
set(EXE_NAME ${dirName} CACHE STRING "Name of executable to be created.")
project(${EXE_NAME})
# Find Armadillo
find_package(Armadillo REQUIRED )
include_directories(${ARMADILLO_INCLUDE_DIRS})
# Find OpenMP
find_package(OpenMP)
if(OPENMP_FOUND)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${OpenMP_C_FLAGS}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
endif()
# Add source files in root directory
add_executable(${EXE_NAME}
main.cpp)
# Linking
set(library_dependencies ${ARMADILLO_LIBRARIES} )
target_link_libraries(${EXE_NAME} ${library_dependencies} OpenMP::OpenMP_CXX)
Upvotes: 3
Views: 603
Reputation: 33679
You can manually do the reduction like this
#pragma omp parallel
{
arma::vec t = arma::zeros<arma::vec>(M);
#pragma omp for nowait
for (int i = 0; i < N; ++i) t += arma::ones<arma::vec>(v.n_rows);
#pragma omp critical
v += t;
}
That works with Clang. This could help you figure out how to define the initializer-expr
For example this works with GCC 7
#pragma omp declare reduction( + : arma::vec : omp_out += omp_in ) \
initializer( omp_priv = arma::zeros<arma::vec>(omp_orig.n_rows))
However with Clang 5.0 the code hangs so I'm not sure what Clang's problem is. I tried other initializer-expr variations but none of them got Clang to work.
I installed clang7 and the OP's code works fine. In general I think it's a better idea to be explicitly set the vector to zero like this
initializer( omp_priv = arma::zeros<arma::vec>(omp_orig.n_rows))
rather than implicitly like this
initializer(omp_priv = omp_orig)
because the implicit case assumes the constructor initializes to zero.
Upvotes: 1
Reputation: 29205
Since the original question also asked about Eigen, here is a self-contained example working with gcc 5, 6, 7, 8 and clang 6. It segfaults with clang 5, likely a bug on clang 5 side. This is essentially the same solution as proposed by Z boson.
#include <Eigen/Core>
#include <iostream>
using namespace Eigen;
typedef VectorXd vec;
#pragma omp declare reduction( + : vec : omp_out += omp_in ) \
initializer( omp_priv = vec::Zero(omp_orig.size()) )
int main() {
int N = 10000;
int M = 100;
vec v = vec::LinSpaced(M,0,M-1);
#pragma omp parallel for reduction(+:v)
for (int i = 0; i < N; ++i){
v += vec::Ones(v.size());
}
std::cout << v.transpose() << std::endl;
return 0;
}
Upvotes: 0