Fischer1000

Reputation: 11

AMD GPU compute with C++

I'm new to the GPU compute world, and I'm trying to run some demo code on my AMD Radeon RX 7800 XT GPU using ROCm 5.7 / HIP.

I've looked through some sources, but I haven't found an answer to why the provided code doesn't compile. I've been following this tutorial with this repository.

The first problem was that I needed to insert #define __HIP_PLATFORM_AMD__ at the beginning of the file. Why doesn't the tutorial mention this?

But the major problem is that on the line int id = blockDim.x * blockIdx.x + threadIdx.x; the compiler reports these errors:

GPUCompute\main.cpp line 16 error: use of undeclared identifier 'blockDim'
GPUCompute\main.cpp line 16 error: use of undeclared identifier 'blockIdx'
GPUCompute\main.cpp line 16 error: use of undeclared identifier 'threadIdx'
GPUCompute\main.cpp line 56 error: use of undeclared identifier 'hipLaunchKernelGGL'

Note that I shortened the path for privacy reasons.

What should I do? Do I need to declare them? If yes, how do I know what values to set them to?

My code is:

#define __HIP_PLATFORM_AMD__

#include <hip/hip_runtime.h>
#include <hip/amd_detail/amd_hip_runtime.h>
#include <cmath>    // ceil, fabs
#include <cstdlib>  // malloc, free, exit
#include <stdio.h>
#include <iostream>

// Size of array
#define N 1048576

using namespace std;

// Kernel
__global__ void vector_addition(double *a, double *b, double *c)
{
    int id = blockDim.x * blockIdx.x + threadIdx.x;
    if(id < N)
        c[id] = a[id] + b[id];
}

// Main program
int main()
{
    // Number of bytes to allocate for N doubles
    size_t bytes = N*sizeof(double);

    // Allocate memory for arrays A, B, and C on host
    double *A = (double*)malloc(bytes);
    double *B = (double*)malloc(bytes);
    double *C = (double*)malloc(bytes);

    // Allocate memory for arrays d_A, d_B, and d_C on device
    double *d_A, *d_B, *d_C;
    hipMalloc(&d_A, bytes);
    hipMalloc(&d_B, bytes);
    hipMalloc(&d_C, bytes);

    // Fill host arrays A, B, and C
    for(int i=0; i<N; i++)
    {
        A[i] = 1.0;
        B[i] = 2.0;
        C[i] = 0.0;
    }

    // Copy data from host arrays A and B to device arrays d_A and d_B
    hipMemcpy(d_A, A, bytes, hipMemcpyHostToDevice);
    hipMemcpy(d_B, B, bytes, hipMemcpyHostToDevice);

    // Set execution configuration parameters
    //      thr_per_blk: number of HIP threads per grid block
    //      blk_in_grid: number of blocks in grid
    int thr_per_blk = 128;
    int blk_in_grid = ceil(float(N) / thr_per_blk);

    // Launch kernel
    hipLaunchKernelGGL(vector_addition, blk_in_grid, thr_per_blk, 0, 0, d_A, d_B, d_C);

    // Copy data from device array d_C to host array C
    hipMemcpy(C, d_C, bytes, hipMemcpyDeviceToHost);

    // Verify results
    double tolerance = 1.0e-14;
    for(int i=0; i<N; i++)
    {
        if(fabs(C[i] - 3.0) > tolerance)
        {
            printf("Error: value of C[%d] = %f instead of 3.0\n", i, C[i]);
            exit(-1);
        }
    }

    // Free CPU memory
    free(A);
    free(B);
    free(C);

    // Free GPU memory
    hipFree(d_A);
    hipFree(d_B);
    hipFree(d_C);

    printf("\n---------------------------\n");
    printf("__SUCCESS__\n");
    printf("---------------------------\n");
    printf("N                 = %d\n", N);
    printf("Threads Per Block = %d\n", thr_per_blk);
    printf("Blocks In Grid    = %d\n", blk_in_grid);
    printf("---------------------------\n\n");

    return 0;
}

Note: I'm coding in Code::Blocks and (I think) I set up the compiler found in the ROCm directory. I set the search directories for both the compiler and the linker.
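In case it helps narrow things down: my understanding from the ROCm documentation is that HIP sources are normally built with the hipcc compiler driver rather than a plain g++/clang++ invocation, since hipcc is what provides the device built-ins like blockDim. Something like this on the command line (the output name is my own choice, not from the tutorial):

```shell
# Build with the HIP compiler driver shipped with ROCm
hipcc main.cpp -o vector_addition
```

I'm not sure whether the compiler I pointed Code::Blocks at is actually hipcc or just the bundled clang.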

My specifications (in case they matter) are:

And for those interested in the desired behavior: I simply want it to compile and perform the vector additions on my GPU.

Upvotes: 0

Views: 456

Answers (0)
