baptiste

Reputation: 1169

OpenCL Compute Units Info

I am currently working on an i.MX6.Q platform with an embedded Vivante GC2100 GPU. The (very short) technical specifications provided by Vivante say that I get 4 shader cores if I vectorize and 16 if I don't (http://www.vivantecorp.com/index.php/en/technology/gpgpu.html).

When I query the OpenCL device info for my GPU directly, it reports 4 compute units and a preferred vector width of 4.

Does that mean the GPU will automatically detect whether I vectorize or not? Will it always correctly use all the cores it can (the current version of my program has no vectorization), and is there a way to be sure of it?

If I don't use aligned data, do I still have to vectorize to benefit from the GPU's capabilities, or can I just keep using the GPU without vectorization? I am currently benchmarking the i.MX6.Q for OpenCL, so I will vectorize what I can anyway and see for myself, but if you know some of the theory behind it, I'd be glad to hear it!

Baptiste

Upvotes: 1

Views: 536

Answers (1)

Ani

Reputation: 10896

It depends on whether your kernel is vectorizable by the particular OpenCL compiler you're using. If you keep your data unpacked (all single floats), your compiler may be able to perform work-item vectorization, that is, combine several scalar work-items into vector instructions.

In fact, for this reason CUDA, unlike OpenCL, does not implement built-in vector operations (its vector types such as float4 are plain structs without component-wise arithmetic operators). I would advise against manual packing because it makes it harder for the compiler to auto-vectorize.

The GPU doesn't detect or "use" vectorization: the compiler generates object code that uses vectorized instructions where possible (based on your logic). As for finding out whether your particular kernel was vectorized, you'll have to refer to the documentation/tools of your implementation. As for using all cores, that depends on the global work size. If you don't submit enough work to keep all the compute units (SMs, in NVIDIA terms) on the GPU busy, it will be underutilized.
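To make the work-size point concrete, here is a minimal sketch in plain host-side C (no OpenCL calls) of how you might sanity-check a launch configuration. The function names are hypothetical helpers, not part of any OpenCL API, and the check is only a necessary condition: since a work-group runs on a single compute unit, fewer work-groups than compute units guarantees idle units, but having enough groups does not by itself guarantee good utilization.

```c
#include <stddef.h>

/* Hypothetical helper: round the problem size n up to the next multiple
 * of local_size, as is commonly done before enqueueing a kernel when a
 * local size is specified. local_size must be greater than zero. */
size_t round_up_global_size(size_t n, size_t local_size)
{
    return ((n + local_size - 1) / local_size) * local_size;
}

/* Number of work-groups a launch produces, assuming global_size is
 * already a multiple of local_size. */
size_t work_group_count(size_t global_size, size_t local_size)
{
    return global_size / local_size;
}

/* Returns 1 if the launch has at least one work-group per compute unit,
 * 0 otherwise. compute_units would be the value the device reports
 * (4 on the GPU discussed in the question). */
int can_occupy_all_units(size_t global_size, size_t local_size,
                         size_t compute_units)
{
    return work_group_count(global_size, local_size) >= compute_units;
}
```

For example, 1000 elements with a local size of 64 round up to a global size of 1024, which gives 16 work-groups, enough to give each of 4 compute units some work. A global size of 128 with the same local size gives only 2 work-groups, so at least two units would sit idle.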

Also note that most OpenCL implementations will prefer (and allocate) aligned data unless you specifically prevent it using the packed attribute.
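As a rough illustration of what the packed attribute changes, here is a sketch in host-side C using the GCC/Clang `__attribute__((packed))` extension as a stand-in; OpenCL C defines a packed attribute with the same syntax for kernel-side types. The struct and function names are hypothetical, and the exact padding of the unpacked struct depends on the platform.

```c
#include <stddef.h>

/* Default layout: the compiler pads after 'tag' so that 'value'
 * starts at an offset suitable for a float. */
struct aligned_sample {
    char  tag;
    float value;
};

/* Packed layout: no padding, so 'value' follows 'tag' immediately
 * and may sit at a misaligned address. */
struct __attribute__((packed)) packed_sample {
    char  tag;
    float value;
};

size_t aligned_size(void) { return sizeof(struct aligned_sample); }
size_t packed_size(void)  { return sizeof(struct packed_sample); }

size_t aligned_value_offset(void)
{
    return offsetof(struct aligned_sample, value);
}

size_t packed_value_offset(void)
{
    return offsetof(struct packed_sample, value);
}
```

On a typical platform with 4-byte floats, the unpacked struct places `value` at offset 4, while the packed one places it at offset 1. The packed layout saves space but gives up the alignment the implementation would otherwise provide.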

Upvotes: 2
