RoyTron

Reputation: 1

How do I properly include files from the NVIDIA C++ standard library?

I was trying to use shared pointers in CUDA via NVIDIA's version of the C++ standard library, so I tried to include <cuda/std/detail/libcxx/include/memory>. I got some errors. One of them is "cannot open source file" for <__config> and <__functional_base>, even though those files are clearly in the directory; it's as if Visual Studio acts like those files don't exist. Another error I get is "linkage specification is incompatible with previous" for <cmath>.

I did a little digging and found out that the "cannot open source file" error occurs for every extensionless file starting with an underscore in cuda/std/detail/libcxx/include/. It is as if Visual Studio acts like those files don't exist, despite their being clearly located in the additional include directories. Furthermore, when I type cuda/std/detail/libcxx/include/, IntelliSense won't find these files. If I can get Visual Studio to recognize those files, I should be able to include any file from NVIDIA's version of the standard library.

Upvotes: 0

Views: 1759

Answers (1)

talonmies

Reputation: 72349

The first thing to understand is that CUDA doesn't have a C++ standard library. What you are referring to is libcu++, an extremely bare-bones, heterogeneous reimplementation of a tiny subset of what is defined in the C++ standard library. You can use whatever is defined in libcu++ (and that is not much; it is a very incomplete implementation) as follows:

  1. Prepend the local path cuda/std/ to the name of whatever standard library header you are including, so that the import comes from libcu++ rather than the native host C++ standard library
  2. Change the namespace from std to cuda::std
  3. Compile with nvcc

As a simple example:

$ cat atomics.cpp

#include <iostream>         // std::cout
#include <cuda/std/atomic>  // cuda::std::atomic, cuda::std::atomic_flag, ATOMIC_FLAG_INIT
#include <thread>           // std::thread, std::this_thread::yield
#include <vector>           // std::vector

cuda::std::atomic<bool> ready (false);
cuda::std::atomic_flag winner = ATOMIC_FLAG_INIT;

void count1m (int id) {
  while (!ready) { std::this_thread::yield(); }      // wait for the ready signal
  for (volatile int i=0; i<1000000; ++i) {}          // go!, count to 1 million
  if (!winner.test_and_set()) { std::cout << "thread #" << id << " won!\n"; }
}

int main ()
{
  std::vector<std::thread> threads;
  std::cout << "spawning 10 threads that count to 1 million...\n";
  for (int i=1; i<=10; ++i) threads.push_back(std::thread(count1m,i));
  ready = true;
  for (auto& th : threads) th.join();

  return 0;
}

$ nvcc -std=c++11 -o atomics atomics.cpp -lpthread
$ ./atomics
spawning 10 threads that count to 1 million...
thread #6 won!

Note that as per the documentation, there are presently (CUDA 11.2) only implementations of:

  • <atomic>
  • <latch>
  • <barrier>
  • <semaphore>
  • <chrono>
  • <cfloat>
  • <ratio>
  • <climits>
  • <cstdint>
  • <type_traits>
  • <tuple>
  • <functional>
  • <utility>
  • <version>
  • <cassert>
  • <cstddef>

with <complex> support coming in the next CUDA release, from the looks of things.

You mentioned shared pointers: there is no <memory> implementation at present, so shared pointers cannot be made to work.

Upvotes: 2
