user2531350

Reputation: 61

Using LLVM 3.3 backend to compile OpenCL for AMD

How exactly does one use the new R600 backend inside LLVM 3.3 to generate a binary suitable for passing to the OpenCL clCreateProgramWithBinary API on an AMD card? Are there any code samples for how to do this?

I have seen a clang command line for compiling for AMD, but I haven't seen anywhere how to use the output with the driver.

Thanks very much.

Upvotes: 6

Views: 2034

Answers (2)

WON

Reputation: 93

You may want to use libclc, which provides OpenCL's built-in functions (https://libclc.llvm.org/).
Unfortunately, it requires LLVM 3.7 or later.
This is because the AMDGPU backend is only available under that name in LLVM 3.7 and later. The LLVM 3.3 and clang 3.3 documentation linked below does not describe an OpenCL front end in clang or an AMDGPU backend in LLVM.
(clang 3.3: http://releases.llvm.org/3.3/tools/clang/docs/UsersManual.html)
(LLVM 3.3: http://releases.llvm.org/3.3/docs/index.html)
(LLVM 3.7: http://releases.llvm.org/3.7.0/docs/AMDGPUUsage.html)
(I do not know why AMDGPU backend support is not mentioned in the release notes.)

So if you want to compile an OpenCL kernel for an AMD GPU with the AMDGPU backend, you will need LLVM 3.7 or later.

If you have to stay on LLVM 3.3, look at the R600 backend. I am not certain, but R600 appears to be the former name of the AMDGPU backend. (https://www.phoronix.com/scan.php?page=news_item&px=amd-r600-amdgpu-llvm)

Upvotes: 0

shining

Reputation: 87

You can read the test cases in llvm/test/CodeGen/R600 in the LLVM source tree.

For example: add.ll

;RUN: llc < %s -march=r600 -mcpu=redwood | FileCheck %s

;CHECK: ADD_INT T{{[0-9]+\.[XYZW], T[0-9]+\.[XYZW], T[0-9]+\.[XYZW]}}
;CHECK: ADD_INT * T{{[0-9]+\.[XYZW], T[0-9]+\.[XYZW], T[0-9]+\.[XYZW]}}
;CHECK: ADD_INT * T{{[0-9]+\.[XYZW], T[0-9]+\.[XYZW], T[0-9]+\.[XYZW]}}
;CHECK: ADD_INT * T{{[0-9]+\.[XYZW], T[0-9]+\.[XYZW], T[0-9]+\.[XYZW]}}

define void @test(<4 x i32> addrspace(1)* %out, <4 x i32> addrspace(1)* %in) {
  %b_ptr = getelementptr <4 x i32> addrspace(1)* %in, i32 1
  %a = load <4 x i32> addrspace(1) * %in
  %b = load <4 x i32> addrspace(1) * %b_ptr
  %result = add <4 x i32> %a, %b
  store <4 x i32> %result, <4 x i32> addrspace(1)* %out
  ret void
}

Then you can pass the output of llc directly to clCreateProgramWithBinary.
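For the host side, here is a minimal sketch in C of loading such a file and handing its bytes to clCreateProgramWithBinary. It assumes the file is in a format the installed OpenCL driver accepts for the target device, which may not hold for every driver. The helper names read_binary_file and program_from_binary and the USE_OPENCL macro are made up for this illustration; only the OpenCL calls themselves are part of the standard API. Without -DUSE_OPENCL, only the file loader is compiled, so the sketch builds on machines with no OpenCL headers installed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a whole file into a malloc'd buffer.
 * Returns NULL on failure and stores the byte count in *out_size. */
unsigned char *read_binary_file(const char *path, size_t *out_size)
{
    assert(path != NULL && out_size != NULL);
    *out_size = 0;

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;

    /* Find the file size by seeking to the end. */
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long len = ftell(f);
    if (len <= 0) { fclose(f); return NULL; }
    rewind(f);

    unsigned char *buf = malloc((size_t)len);
    if (buf == NULL) { fclose(f); return NULL; }

    size_t got = fread(buf, 1, (size_t)len, f);
    fclose(f);
    if (got != (size_t)len) { free(buf); return NULL; }

    *out_size = got;
    return buf;
}

#ifdef USE_OPENCL  /* hypothetical macro: define it when OpenCL headers are available */
#include <CL/cl.h>

/* Hypothetical helper: create and build a program for one device from a
 * binary file. Returns NULL if the driver rejects the binary. */
cl_program program_from_binary(cl_context ctx, cl_device_id dev, const char *path)
{
    size_t size = 0;
    unsigned char *bytes = read_binary_file(path, &size);
    if (bytes == NULL)
        return NULL;

    const unsigned char *binaries[1] = { bytes };
    cl_int binary_status = CL_SUCCESS;  /* per-device load result */
    cl_int err = CL_SUCCESS;
    cl_program prog = clCreateProgramWithBinary(ctx, 1, &dev, &size, binaries,
                                                &binary_status, &err);
    if (err != CL_SUCCESS || binary_status != CL_SUCCESS) {
        if (prog != NULL)
            clReleaseProgram(prog);
        free(bytes);
        return NULL;
    }

    /* A program created from a binary still has to be built before use. */
    err = clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    free(bytes);
    if (err != CL_SUCCESS) {
        clReleaseProgram(prog);
        return NULL;
    }
    return prog;
}
#endif
```

Checking binary_status as well as the error code matters here, because it is where the driver reports that it does not recognise the binary for that particular device.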

Upvotes: 1
