Abator Abetor

Reputation: 2598

How to construct and access libcu++'s <cuda/std/mdspan> on the host

libcu++ 2.1.0 includes an mdspan implementation that works with nvcc in C++14 and later. I tried to port the mdspan sample code from cppreference (https://en.cppreference.com/w/cpp/container/mdspan) using nvcc 12.0 and libcu++ 2.1.0.

I noticed two problems.

First, I am unable to construct the mdspan in the same way as in the example:

no instance of constructor "mdspan" matches the argument list argument types are: (int *, int, int)

Second, access to the mdspan via operator[] does not compile:

error: no operator "[]" matches these operands operand types are: cuda::std::__4::mdspan<int, cuda::std::__4::extents<std::size_t, 2UL, 3UL, 2UL>, cuda::std::__4::layout_right, cuda::std::__4::default_accessor> [ std::size_t ]

1. How does one specify an extent in the constructor which is not known at compile-time?

2. How does one access the data of an mdspan?

Below is my code, which does not compile with nvcc -Ilibcudacxx-2.1.0/include/ -std=c++17 main.cu -o main.

#include <cstddef>
#include <vector>
#include <cstdio>

#include <cuda/std/mdspan>
 
int main()
{
    std::vector v{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
 
    //error: no instance of constructor "mdspan" matches the argument list. argument types are: (int *, int, int)
    //auto ms2 = cuda::std::mdspan(v.data(), 2, 6); 
    //auto ms3 = cuda::std::mdspan(v.data(), 2, 3, 2);

    //no compilation error with compile-time extents
    cuda::std::mdspan<int, cuda::std::extents<std::size_t, 2,6>> ms2(v.data()); 
    cuda::std::mdspan<int, cuda::std::extents<std::size_t, 2,3,2>> ms3(v.data()); 
 
    // write data using 2D view
    for (std::size_t i = 0; i != ms2.extent(0); i++)
        for (std::size_t j = 0; j != ms2.extent(1); j++)
            //no operator "[]" matches these operands. operand types are: cuda::std::__4::mdspan<int, cuda::std::__4::extents<std::size_t, 2UL, 6UL>, cuda::std::__4::layout_right, cuda::std::__4::default_accessor<int>> [ std::size_t ]
            ms2[i, j] = i * 1000 + j;
 
    // read back using 3D view
    for (std::size_t i = 0; i != ms3.extent(0); i++)
    {
        printf("slice @ i = %zu\n", i);
        for (std::size_t j = 0; j != ms3.extent(1); j++)
        {
            for (std::size_t k = 0; k != ms3.extent(2); k++)
                printf("%d ",  ms3[i, j, k]);
            printf("\n");
        }
    }
}

Upvotes: 3

Views: 479

Answers (1)

paleonix

Reputation: 3029

The libcu++ implementation of mdspan is based on the reference implementation by Kokkos, whose README lists some caveats:

This implementation is fully conforming with the version of mdspan voted into the C++23 draft standard in July 2022. When not in C++23 mode the implementation deviates from the proposal as follows:

C++20

  • implements operator() not operator[]
    • note you can control which operator is available with defining MDSPAN_USE_BRACKET_OPERATOR=[0,1] and MDSPAN_USE_PAREN_OPERATOR=[0,1] irrespective of whether multi dimensional subscript support is detected.

C++17

  • mdspan has a default constructor even in cases where it shouldn't (i.e. all static extents, and default constructible mapping/accessor)
  • the conditional explicit markup is missing, making certain constructors implicit
    • most notably you can implicitly convert from dynamic extent to static extent, which you can't in C++20 mode
  • there is a constraint on layout_left::mapping::stride(), layout_right::mapping::stride() and layout_stride::mapping::stride() that extents_type::rank() > 0 is true, which is not implemented in C++17 or C++14.

C++14

  • deduction guides don't exist
  • submdspan (P2630) is not available - an earlier variant of submdspan is available up to release 0.5 in C++14 mode
  • benchmarks are not available (they need submdspan)

The reference implementation uses operator() instead of operator[] (see section "C++20" above) because operator[] with multiple arguments is a C++23 language feature and cannot be back-ported to earlier C++ versions. Therefore one needs:

        ms2(i, j) = i * 1000 + j;

and

                printf("%d ",  ms3(i, j, k));

instead.
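
To make clear what the call-operator access computes, here is a minimal sketch in plain C++ that does not use libcu++ at all. View2D, View3D and write_2d_read_3d are hypothetical names invented for this illustration. The sketch assumes the row-major layout_right mapping that appears in the error messages of the question, and it shows why the 2D and 3D views address the same 12 elements:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a rank-2 mdspan with layout_right (row-major).
// This is NOT libcu++ code; it only illustrates the offset arithmetic.
struct View2D {
    int* data;
    std::size_t rows, cols;

    // Pre-C++23 style access: call operator instead of multi-argument operator[].
    int& operator()(std::size_t i, std::size_t j) const {
        assert(i < rows && j < cols);
        return data[i * cols + j];
    }
};

// Hypothetical rank-3 counterpart that can view the same buffer.
struct View3D {
    int* data;
    std::size_t n0, n1, n2;

    int& operator()(std::size_t i, std::size_t j, std::size_t k) const {
        assert(i < n0 && j < n1 && k < n2);
        return data[(i * n1 + j) * n2 + k];
    }
};

// Writes through the 2D view, then reads every element back through the
// 3D view in (i, j, k) order, mirroring the loops in the question.
std::vector<int> write_2d_read_3d() {
    std::vector<int> v(12);
    View2D ms2{v.data(), 2, 6};
    View3D ms3{v.data(), 2, 3, 2};

    for (std::size_t i = 0; i != ms2.rows; ++i)
        for (std::size_t j = 0; j != ms2.cols; ++j)
            ms2(i, j) = static_cast<int>(i * 1000 + j);

    std::vector<int> out;
    for (std::size_t i = 0; i != ms3.n0; ++i)
        for (std::size_t j = 0; j != ms3.n1; ++j)
            for (std::size_t k = 0; k != ms3.n2; ++k)
                out.push_back(ms3(i, j, k));
    return out;
}
```

Calling write_2d_read_3d() returns the values 0 to 5 followed by 1000 to 1005, i.e. the first 3D slice covers row 0 of the 2D view and the second slice covers row 1.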

The constructor used in the sample relies on CTAD* and is meant to work in C++17 and later (C++14 has neither CTAD nor deduction guides); at least the reference implementation has a test for it. CTAD might be turned off for some (host) compilers or versions, but there is an alternative way of specifying dynamic extents without CTAD:

int main() {
    ...
    using Ext2D = cuda::std::dextents<int, 2>;
    using Ext3D = cuda::std::dextents<int, 3>;
    auto ms2 = cuda::std::mdspan<int, Ext2D>(v.data(), Ext2D(2, 6));
    // This also works: CTAD doesn't seem to work here, but mdspan's
    // variadic constructor does work in C++14 and C++17 once the
    // template arguments are spelled out
    auto ms3 = cuda::std::mdspan<int, Ext3D>(v.data(), 2, 3, 2);
    ...
}

*CTAD: Class Template Argument Deduction
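
As a rough illustration of why the variadic constructor needs no CTAD, the following C++14-compatible sketch stores its extents at run time and takes the rank from an explicit template argument. DynView is a hypothetical class written for this answer, not the libcu++ type, and it assumes a row-major layout with at least one dimension:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical rank-N view with run-time extents; NOT cuda::std::mdspan.
template <class T, std::size_t Rank>
class DynView {
    static_assert(Rank > 0, "this sketch assumes at least one dimension");

public:
    // Variadic constructor: the rank comes from the explicit template
    // argument, so no class template argument deduction is involved.
    template <class... Sizes>
    explicit DynView(T* data, Sizes... sizes)
        : data_(data), extents_{{static_cast<std::size_t>(sizes)...}} {
        static_assert(sizeof...(Sizes) == Rank, "one extent per dimension");
    }

    std::size_t extent(std::size_t r) const { return extents_[r]; }

    // Row-major offset, accumulated from the leftmost index to the rightmost.
    template <class... Idx>
    T& operator()(Idx... idx) const {
        static_assert(sizeof...(Idx) == Rank, "one index per dimension");
        const std::size_t ids[Rank] = {static_cast<std::size_t>(idx)...};
        std::size_t off = 0;
        for (std::size_t r = 0; r < Rank; ++r) {
            assert(ids[r] < extents_[r]);
            off = off * extents_[r] + ids[r];
        }
        return data_[off];
    }

private:
    T* data_;
    std::array<std::size_t, Rank> extents_;
};
```

With this sketch, DynView<int, 3>(ptr, 2, 3, 2) compiles in C++14 because both template arguments are written out, which is the same reason the explicit cuda::std::mdspan<int, Ext3D> spelling above avoids the CTAD problem.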

Upvotes: 5
