Reputation: 432
I have the following kernel, which is supposed to copy elements from the src matrix to the dst matrix based on the indices found in index. The index is computed correctly, but dst is not updated. What am I missing?
__kernel void
src_indexed_copy(__global real *dst, __global const real *src,
                 __global const int *index, int src_offset)
{
    int id = get_global_id(ROW_DIM);
    int src_idx = src_offset + index[id];
    dst[id] = src[src_idx];
}
The global work size has as many work items as there are indices in the index array, one work item per index.
The linear code would look something like this:
for (k = 0; k < n; k++) {
    dst[k] = src[m * column + index[k]];
}
This copies all the indexed elements from column column of matrix src, where m is the number of rows, so m * column plays the role of src_offset in the kernel.
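To narrow down whether the kernel or the host code is at fault, a plain-C reference of the same copy could be run on the host and compared against what is read back from the device. The following is only a minimal sketch under stated assumptions: the name src_indexed_copy_ref, the src_len parameter, the bounds checks, and the typedef of real as float are all hypothetical and not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the original typedef of `real` is not shown; float is used here. */
typedef float real;

/*
 * Hypothetical host-side reference for the indexed copy.
 * Copies src[src_offset + index[k]] into dst[k] for k in [0, n).
 * Returns 0 on success, or -1 if an index is negative or falls outside src.
 */
static int
src_indexed_copy_ref(real *dst, const real *src, size_t src_len,
                     const int *index, size_t n, size_t src_offset)
{
    size_t k;

    assert(dst != NULL && src != NULL && index != NULL);

    for (k = 0; k < n; k++) {
        size_t src_idx;

        if (index[k] < 0)
            return -1;                 /* negative index: reject */
        src_idx = src_offset + (size_t)index[k];
        if (src_idx >= src_len)
            return -1;                 /* would read past the end of src */
        dst[k] = src[src_idx];
    }
    return 0;
}
```

If the reference output matches the expected values but the buffer read back from the device does not, the difference is more likely in how the buffers and kernel arguments are set up on the host than in the kernel body itself.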
This is how I am reading the buffer back (asked in comments):
rc = clEnqueueReadBuffer(ompctx->clctx.queue, c,
                         CL_TRUE, 0, i * sizeof(real), &tmp[0],
                         0, NULL, NULL);
if (rc != CL_SUCCESS) {
    log_error("omp", "[%d] readbuf() failed", rc);
    goto err;
}
log_info("omp", "c");
for (k = 0; k < i; k++) {
    log_info("omp", "%6.8f", tmp[k]);
}
Upvotes: 0
Views: 61
Reputation: 9906
Something must be wrong in the host code. Please verify:
Upvotes: 1