Recommend a fast algorithm for sorting within each segment of an array

Question

I am sorting the data within each segment of an array on the GPU. Each segment has 32 elements, and no further sorting or merging is done across segments. I load each segment from global memory into shared memory, sort it there, and store it back to global memory once the segment is sorted. Which parallel algorithm is preferred for the highest throughput?

Farzad · Accepted Answer

Since segment sizes are all 32, I personally suggest merge sort. There's also this paper you can refer to.
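For illustration, here is a minimal sketch of what a merge sort over 32-element segments in shared memory could look like. It is not taken from the paper mentioned above. It assumes `int` keys, one warp per segment with one thread per element, and an array length that is a multiple of 32. The names `segmentedMergeSort32` and `sortSegments` are made up for this example. In each pass, every thread finds the final position of its own element with a binary search in the neighbouring sorted run, so a segment is sorted after 5 passes.

```cuda
#include <vector>
#include <cuda_runtime.h>

constexpr int SEG = 32;  // segment size from the question

// One warp sorts one 32-element segment, one thread per element.
// Assumes blockDim.x is a multiple of 32 and 2 * blockDim.x ints of
// dynamic shared memory (two ping-pong buffers).
__global__ void segmentedMergeSort32(int* data, int numSegments)
{
    extern __shared__ int smem[];
    int* src = smem;
    int* dst = smem + blockDim.x;

    int tid = threadIdx.x;
    int lane = tid % SEG;
    int warpBase = tid - lane;  // first shared-memory slot of this segment
    int seg = blockIdx.x * (blockDim.x / SEG) + tid / SEG;
    if (seg >= numSegments) return;  // a whole warp exits together

    // Global memory -> shared memory.
    src[tid] = data[seg * SEG + lane];
    __syncwarp();

    // Merge sorted runs of width 1, 2, 4, 8, 16.
    for (int w = 1; w < SEG; w *= 2) {
        int key = src[tid];
        int pairStart = lane & ~(2 * w - 1);  // start of the two runs being merged
        int posInPair = lane - pairStart;
        bool inLeft = posInPair < w;
        int own = inLeft ? posInPair : posInPair - w;  // index inside own run
        const int* other = src + warpBase + pairStart + (inLeft ? w : 0);

        // Count the elements of the other run that must precede key.
        // Ties are resolved in favour of the left run, so ranks are unique.
        int lo = 0, hi = w;
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            bool before = inLeft ? (other[mid] < key) : (other[mid] <= key);
            if (before) lo = mid + 1; else hi = mid;
        }
        dst[warpBase + pairStart + own + lo] = key;
        __syncwarp();

        int* tmp = src; src = dst; dst = tmp;  // swap ping-pong buffers
    }

    // Shared memory -> global memory.
    data[seg * SEG + lane] = src[tid];
}

// Hypothetical host helper: sorts every 32-element segment of h in place.
// Assumes h.size() is a multiple of 32; error checking is omitted for brevity.
void sortSegments(std::vector<int>& h)
{
    int numSegments = static_cast<int>(h.size()) / SEG;
    if (numSegments == 0) return;
    size_t bytes = static_cast<size_t>(numSegments) * SEG * sizeof(int);
    int* d = nullptr;
    cudaMalloc(&d, bytes);
    cudaMemcpy(d, h.data(), bytes, cudaMemcpyHostToDevice);
    const int threads = 256;  // 8 segments per block
    int blocks = (numSegments * SEG + threads - 1) / threads;
    segmentedMergeSort32<<<blocks, threads, 2 * threads * sizeof(int)>>>(d, numSegments);
    cudaMemcpy(h.data(), d, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d);
}
```

Because each segment is handled by a single warp, `__syncwarp()` is enough and no block-wide barrier is needed. Whether this beats other options, such as a bitonic network built on warp shuffles, depends on the hardware and key type, so it is worth benchmarking both.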
