Reputation: 13417
I'm wondering whether I can use SIMD intrinsics in GPU code, such as a CUDA kernel or an OpenCL kernel. Is that possible?
Upvotes: 2
Views: 2458
Reputation:
You can use the vector data types built into the OpenCL C language, for example float4 or float8. If you run with the Intel or AMD device drivers, these should get converted to SSE/AVX instructions by the vendor's OpenCL device driver. OpenCL also includes several built-in functions, such as dot(v1, v2), which should use the SSE/AVX dot product instructions. Is there a particular intrinsic you are interested in that you don't think you can get from the OpenCL C language?
Upvotes: 3
Reputation: 587
Yes, you can use SIMD intrinsics in kernel code on a CPU or GPU, provided the compiler supports these intrinsics.
Usually the better way to use SIMD is through the vector data types in the kernels, so that the compiler decides whether to use SIMD based on availability. This makes the kernel code portable as well.
Upvotes: 1
Reputation: 12263
Mostly no, because GPU programming languages use a different programming model (SIMT). However, AMD GPUs do have an OpenCL extension which provides intrinsics for some byte-granularity operations (thus allowing four 8-bit values to be packed into one 32-bit GPU register). These operations are intended for video processing.
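For comparison on the CUDA side: CUDA has its own small family of packed-byte device intrinsics that work on the same idea of four 8-bit lanes in one 32-bit register. The sketch below uses __vadd4, which adds the four bytes of two words independently with wrap-around. The kernel name add_bytes_kernel and the host helper run_add_bytes are made up for illustration, error checking is left out for brevity, and this is the NVIDIA counterpart, not the AMD extension mentioned above.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread adds one pair of 32-bit words as four independent 8-bit lanes.
// __vadd4 wraps around inside each byte; no carry crosses into the next byte.
__global__ void add_bytes_kernel(const unsigned int* a, const unsigned int* b,
                                 unsigned int* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = __vadd4(a[i], b[i]);
    }
}

// Hypothetical host helper (not part of any library): copies the inputs to the
// device, launches the kernel, and copies the result back.
// Assumes a and b have the same length. Error checking omitted for brevity.
std::vector<unsigned int> run_add_bytes(const std::vector<unsigned int>& a,
                                        const std::vector<unsigned int>& b)
{
    int n = static_cast<int>(a.size());
    std::vector<unsigned int> out(n);
    if (n == 0) return out;

    size_t bytes = n * sizeof(unsigned int);
    unsigned int *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    add_bytes_kernel<<<blocks, threads>>>(d_a, d_b, d_out, n);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_out);
    return out;
}
```

Calling run_add_bytes({0x01FF0310u}, {0x01010101u}) should give 0x02000411: the 0xFF byte wraps to 0x00 without carrying into its neighbour, whereas a plain 32-bit add would give 0x03000411.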
Upvotes: 1
Reputation: 7839
No, SIMD intrinsics are just tiny wrappers for ASM code. They are CPU specific. More about them here.
Generally speaking, why would you do that? CUDA and OpenCL already contain many "functions" which are actually "GPU intrinsics" (all of these, for example, are single-precision math intrinsics for the GPU).
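To make the "GPU intrinsics" point concrete, here is a minimal sketch of a CUDA kernel that calls two of the single-precision device intrinsics, __expf (fast approximate exponential) and __fmaf_rn (fused multiply-add, round to nearest even). The kernel name decay_kernel, the helper run_decay, and the formula itself are made up for illustration. Error checking is omitted to keep it short.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Computes y[i] = exp(-x[i]) * a + b using GPU intrinsics.
// __expf trades accuracy for speed compared with expf.
__global__ void decay_kernel(const float* x, float* y, int n, float a, float b)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        y[i] = __fmaf_rn(__expf(-x[i]), a, b);
    }
}

// Hypothetical host helper (not part of any library): runs the kernel on a
// host vector and returns the result. Error checking omitted for brevity.
std::vector<float> run_decay(const std::vector<float>& x, float a, float b)
{
    int n = static_cast<int>(x.size());
    std::vector<float> y(n);
    if (n == 0) return y;

    size_t bytes = n * sizeof(float);
    float *d_x = nullptr, *d_y = nullptr;
    cudaMalloc(&d_x, bytes);
    cudaMalloc(&d_y, bytes);
    cudaMemcpy(d_x, x.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    decay_kernel<<<blocks, threads>>>(d_x, d_y, n, a, b);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(y.data(), d_y, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_y);
    return y;
}
```

These calls look like ordinary functions, but they are only available in device code compiled by nvcc, in the same way that SSE/AVX intrinsics are only available when targeting an x86 CPU.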
Upvotes: 5