Karthik Hegde

Reputation: 183

Improving Memory Access time in OpenCL

For an array X in global memory, I need to write two values in every kernel execution.

X[p]=mul1+mul2;
X[p+a]=mul1-mul2;

Here 'a' can range from 0 to very high values. I observed that these two writes slow down my kernel to a great extent.

  1. What is the best way to improve memory write performance in OpenCL?
  2. Are coalesced memory writes possible only for writes within a single kernel?

Upvotes: 0

Views: 114

Answers (1)

GaTTaCa

Reputation: 489

Assuming p depends linearly on your work-item (thread) ID, you are already doing things the right way. You could try passing X+a as a second argument to your kernel, so that you write Y[p]=mul1-mul2; instead of X[p+a]=mul1-mul2;, but I doubt it will be noticeably faster.

Concerning your second question: if you are thinking of having two kernels, one performing the addition and the other the subtraction, and launching them concurrently, you cannot be sure they will actually run side by side in parallel. Once again, I doubt it would be faster in the end.
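To make the point about access patterns concrete, here is a rough host-side model in plain C (not OpenCL code). It counts how many memory segments a group of neighbouring work-items touches when each one stores a single float. The group size of 32 and the 128-byte segment size are assumptions chosen for illustration, not values taken from the question, and the helper name segments_touched is hypothetical. Real devices differ, so treat this as a sketch of the idea rather than a performance model.

```c
#include <stddef.h>

#define GROUP 32          /* assumed: work-items whose stores are issued together */
#define SEGMENT_BYTES 128 /* assumed: size of one memory transaction in bytes */

/* Work-item i of the group stores one float to index base + i * stride.
   Returns how many distinct SEGMENT_BYTES-sized segments the group touches.
   Addresses only increase with i, so counting segment changes is enough.
   Assumes sizeof(float) == 4, as on common platforms. */
static size_t segments_touched(size_t base, size_t stride)
{
    size_t count = 0;
    size_t last = (size_t)-1; /* sentinel: no segment seen yet */

    for (size_t i = 0; i < GROUP; ++i) {
        size_t byte_offset = (base + i * stride) * sizeof(float);
        size_t segment = byte_offset / SEGMENT_BYTES;
        if (segment != last) {
            ++count;
            last = segment;
        }
    }
    return count;
}
```

With p equal to the work-item ID, the store to X[p] corresponds to segments_touched(0, 1), which gives 1 under these assumptions. The store to X[p+a] corresponds to segments_touched(a, 1): it gives 1 when a is a multiple of 32 (for example a = 1024) and 2 otherwise (for example a = 1000), however large a is. Only a strided pattern such as segments_touched(0, 32) spreads the group over 32 segments. In this model the offset a affects only alignment, not whether the writes are contiguous.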

Upvotes: 0
