Chris Gnam

Reputation: 590

tbb parallel_for: Object with intrusive list node can be part of only one intrusive list simultaneously

I'm currently porting a program that was originally written on Windows 10 to a Red Hat system (VERSION_ID="8.8") with g++ 8.5.0 20210514 (Red Hat 8.5.0-18). I installed TBB with vcpkg and reduced my problem to the following minimal working example. (Note: the original project has over a hundred files. I've tried to keep the parts of the CMakeLists.txt that may be related, but I'm unsure which ones matter.)

CMakeLists.txt:

cmake_minimum_required(VERSION 3.9)

if (CMAKE_BUILD_TYPE STREQUAL "Debug")
    set(CMAKE_CXX_FLAGS_DEBUG "-g -O0")
    set(CMAKE_BUILD_TYPE Debug)
else()
    set(CMAKE_CXX_FLAGS "-Wall -mtune=native -march=native -g")
    set(CMAKE_CXX_FLAGS_RELEASE "-O3 -fno-math-errno -fno-signed-zeros -fno-trapping-math -freciprocal-math -fno-rounding-math -fno-signaling-nans -fexcess-precision=fast")
endif()

project(Example LANGUAGES CXX VERSION 0.9)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_INTERPROCEDURAL_OPTIMIZATION TRUE) 
set(CMAKE_CXX_VISIBILITY_PRESET hidden)
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR})

add_executable(example
    example.cpp
)

find_package(TBB CONFIG REQUIRED)

target_link_libraries(example
    TBB::tbb
    TBB::tbbmalloc
)

vcpkg.json:

{
    "name": "vt",
    "version-string": "",
    "dependencies": [
        "tbb"
    ],
    "builtin-baseline": "36fb23307e10cc6ffcec566c46c4bb3f567c82c6"
}

example.cpp:

#include <cstddef>  // size_t
#include <cstdint>  // uint32_t
#include <vector>

#include "tbb/blocked_range.h"
#include "tbb/parallel_for.h"

int main()
{
    std::vector<uint32_t> vec(100);
    for (size_t i = 0; i < vec.size(); ++i){
        vec[i] = i;
    }

    tbb::parallel_for(tbb::blocked_range<size_t>(0, vec.size()),
        [&](tbb::blocked_range<size_t> r)
        {
            for (size_t i = r.begin(); i < r.end(); ++i)
            {
                vec[i] = vec[i]+1;
            }
        });

    return 0;
}

This compiles and runs fine on my Windows machine (using Visual Studio). It also compiles on the Red Hat machine, but it only runs in Debug mode. When compiled in Release mode, I get the following error:

Assertion node(val).my_prev_node == &node(val) && node(val).my_next_node == &node(val) failed (located in the push_front function, line in file: 135)
Detailed description: Object with intrusive list node can be part of only one intrusive list simultaneously
Aborted (core dumped)

To build in release mode I run the following:

mkdir build
cd build
cmake -DCMAKE_TOOLCHAIN_FILE=~/vcpkg/scripts/buildsystems/vcpkg.cmake ..
cmake --build .

(For Debug I simply add -DCMAKE_BUILD_TYPE=Debug to the cmake configuration call)


Some things I've tried (that did not work):

I tried this exact small example on my personal Ubuntu 22.04.1 LTS machine (with g++ version 11.3.0) and got the same result: it compiles, but only the Debug build runs. The Release build fails with the same error as above.

Upvotes: 1

Views: 190

Answers (2)

Chris Gnam

Reputation: 590

The TBB_USE_ASSERT workaround from my other answer worked for some time, but it came with a performance penalty. With the most up-to-date versions, TBB should now work as expected. Just make sure that, on Linux, you configure CMake with -DCMAKE_BUILD_TYPE=Release.

Upvotes: 0

Chris Gnam

Reputation: 590

A colleague of mine discovered that adding the following line to the CMakeLists.txt is a workaround for now, and he believes he has identified the underlying problem:

add_compile_definitions(TBB_USE_ASSERT)

Below is a brief summary of what he said. I've also added it to an issue I submitted on the vcpkg repository. Hopefully it helps resolve the issue, but if not, hopefully this workaround helps anyone else who stumbles across the same problem. (NOTE: It likely has a performance penalty compared to a true fix on the vcpkg side.)

EDIT: Description from my colleague:

It's fundamentally a build system issue. I had to spend some time single-stepping through the allocator. Basically, it was asserting that uninitialized memory happened to contain its own address. I had to set up a working and a broken copy side by side to track where that value got set in the working copy.

The code that sets it is in the class constructor for the intrusive_list_node helper class and was conditional on a preprocessor macro. That preprocessor macro checks if runtime debugging assertions are enabled. If assertions are enabled, it does some extra initialization. If not, it skips that because it would be replaced as soon as the node is used. The only time tbb would ever traverse a newly created list is when sanity checking it on the first insertion.

The actual issue is that vcpkg is building the tbb binary with assertions enabled. When you do a debug build, you also build things with tbb assertions enabled. That tiny one line of initialization code actually ends up getting compiled when building your application, not tbb. This is because it has to do with initializing data you set aside in your program.

So when you built it in release mode, debug assertions were disabled in your code and it skipped the extra initialization. But your program then gets combined with a version of tbb that was built with runtime assertions enabled. That library tries to assert that the internal data structures are intact, and the assertion fails because the node was left in an uninitialized state that would never have been used by anything other than the sanity check.

TLDR: The issue is that vcpkg is combining your release mode program with a version of tbb that still has debugging assertions enabled. That CMake directive says to compile your code assuming debugging assertions in tbb are enabled.

Upvotes: 1
