Reputation: 1
I am writing OpenCL code to assemble a sparse matrix from a finite element discretization, and I would appreciate some tips on a clever data structure for assembling this matrix in kernel code. The difficulty is that the kernel needs to write to arbitrary (effectively random) positions in the matrix.
Upvotes: 0
Views: 310
Reputation: 5924
Random access to a large data set is taxing on the GPU. I would not allow all the work-items to write randomly into one master table: the scattered writes cannot be coalesced, and work-items that contribute to the same matrix entry would have to be synchronized. Doing that would probably result in worse performance than a serial CPU implementation.
Instead, I would probably give each work-item its own chunk of memory to work on. Maybe each one should assemble part of the matrix using a small coordinate list of (row, column, value) tuples. Each work-item should write only to its own chunk of memory when assembling the matrix data, and then I would pull that data back to the CPU to be sorted and converted to a more efficient format.
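To make that concrete, here is a minimal sketch in plain C rather than OpenCL C, so the idea can be tried on the host first. Everything in it is an assumption for illustration: the 1D linear-element mesh, the names `Triplet`, `assemble_element` and `coo_to_csr`, and the choice of CSR as the "more efficient format". `assemble_element` stands in for what one work-item would do in the kernel (with `e` playing the role of `get_global_id(0)`), and `coo_to_csr` is the CPU-side sort and merge.

```c
#include <stdlib.h>

/* One (row, column, value) entry of a coordinate-list (COO) matrix. */
typedef struct { int row; int col; double val; } Triplet;

/* Work of one work-item: element e of a 1D linear FE mesh (hypothetical
   example problem) writes its 2x2 local matrix into its own 4-entry slot. */
static void assemble_element(int e, double h, Triplet *coo)
{
    Triplet *out = coo + 4 * e;      /* private chunk, so no write conflicts */
    int n[2] = { e, e + 1 };         /* global node numbers of element e */
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j) {
            out[2 * i + j].row = n[i];
            out[2 * i + j].col = n[j];
            out[2 * i + j].val = (i == j ? 1.0 : -1.0) / h;
        }
}

/* Order triplets by row, then by column. */
static int cmp_triplet(const void *a, const void *b)
{
    const Triplet *x = a, *y = b;
    if (x->row != y->row) return x->row < y->row ? -1 : 1;
    if (x->col != y->col) return x->col < y->col ? -1 : 1;
    return 0;
}

/* Host side: sort the triplets, sum duplicates, and build CSR arrays.
   row_ptr needs n_rows + 1 entries; col_idx and vals need up to n entries.
   Returns the number of stored nonzeros. */
static int coo_to_csr(Triplet *coo, int n, int n_rows,
                      int *row_ptr, int *col_idx, double *vals)
{
    qsort(coo, (size_t)n, sizeof *coo, cmp_triplet);
    for (int r = 0; r <= n_rows; ++r) row_ptr[r] = 0;

    int nnz = 0;
    for (int k = 0; k < n; ++k) {
        if (k > 0 && coo[k].row == coo[k - 1].row
                  && coo[k].col == coo[k - 1].col) {
            vals[nnz - 1] += coo[k].val;      /* same entry: accumulate */
        } else {
            col_idx[nnz] = coo[k].col;        /* new entry in this row */
            vals[nnz] = coo[k].val;
            row_ptr[coo[k].row + 1]++;        /* count entries per row */
            nnz++;
        }
    }
    /* Prefix sum turns per-row counts into row start offsets. */
    for (int r = 0; r < n_rows; ++r) row_ptr[r + 1] += row_ptr[r];
    return nnz;
}
```

For a mesh with `n_elems` elements you would allocate `4 * n_elems` triplets, call `assemble_element` once per element (on the GPU, once per work-item), and then pass the buffer to `coo_to_csr` with `n_rows = n_elems + 1`. The buffer is larger than the final matrix because shared entries are stored once per contributing element, which is the price of conflict-free writes.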
If you need to do further work on the sorted matrix data, it would be best to create a second kernel rather than extend the assembly kernel. Kernels run best when each one does a single, straightforward task.
Upvotes: 1