rprospero

Reputation: 961

Accelerate code passes interpreter but fails under CUDA

I have been trying to write a function that computes a histogram of a vector using the Accelerate library. I recognize that histograms aren't the ideal case for GPU processing, but I'm generating a fairly large dataset from a small seed, and it would be nice if it could be reduced to an array of a few kilobytes before transferring it back to main memory.

The code that I've come up with is below. It takes the number of output bins and a vector xs, then creates a new array a where a[x] is the number of occurrences of x in xs.

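import qualified Data.Array.Accelerate as A

hist :: A.Exp Int -> A.Acc (A.Vector Int) -> A.Acc (A.Vector Int)
hist bins xs = A.permute
               (const (+1))                -- combine: ignore the source element, increment the bin
               (A.fill (A.index1 bins) 0)  -- defaults: one zeroed counter per bin
               (A.index1 . (xs A.!))       -- map each source index to the bin named by its value
               xs                          -- source array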
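
As a CPU-side reference for the intended result, the same counting can be sketched in plain Haskell with accumArray from the array package. This is not Accelerate code and says nothing about the CUDA failure; the name histRef and the range filter are assumptions of this sketch (the Accelerate version above performs no range check on the values in xs).

```haskell
import Data.Array (Array, accumArray, elems)

-- Hypothetical CPU reference, not part of Accelerate: bin x of the result
-- holds the number of occurrences of x in xs.
histRef :: Int -> [Int] -> Array Int Int
histRef bins xs =
  accumArray (+) 0 (0, bins - 1)
    -- Each in-range value contributes 1 to its own bin; values outside
    -- [0, bins) are skipped because accumArray fails on out-of-range indices.
    [ (x, 1) | x <- xs, x >= 0, x < bins ]

main :: IO ()
main = print (elems (histRef 4 [0, 1, 1, 3, 3, 3]))  -- prints [1,2,0,3]
```

Comparing the output of the Accelerate interpreter against a reference like this on a small input is one way to confirm that the permute-based version computes the expected counts.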

The hist function appears to run correctly under the Accelerate interpreter. However, when I call it through accelerate-cuda, I get the following error message:

./Data/Array/Accelerate/CUDA/State.hs:85:9: (unhandled): CUDA Exception: unspecified launch failure

My question is two-fold. First, what am I doing that causes CUDA to fail? Second, is there a better way to compute a histogram with Accelerate?

Upvotes: 2

Views: 190

Answers (1)

tmcdonell

Reputation: 61

This was a bug in Accelerate (and/or an underlying change in CUDA), which has now been fixed. Apologies for taking so long to get to it; this slipped off my radar.

Upvotes: 1
