gil

Reputation: 2298

CUDA vs DirectX 10 for parallel mathematics. Any thoughts you have about it?

Upvotes: 0

Views: 1590

Answers (5)

ArchaeaSoftware

Reputation: 4422

It should be easy to decide between them.

If your app can tolerate being Windows-specific, you can still consider DirectX Compute. Otherwise, use CUDA or OpenCL.

If your app cannot tolerate a vendor lock on NVIDIA, you cannot use CUDA; use OpenCL or DirectX Compute instead.

If your app is doing DirectX interop, consider that CUDA/OpenCL will incur context switch overhead doing graphics API interop, and DirectX Compute will not.

Unless one or more of those criteria affect your application, use the great granddaddy of massively parallel toolchains: CUDA.

Upvotes: 0

Gustavo Muenz

Reputation: 9552

CUDA itself has nothing to do with whether double-precision floating point is supported; that depends on the hardware available. Double precision is supported on GPUs with compute capability 1.3 or higher, i.e. the GTX 200 series and the Tesla 10-series cards.
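
If you want to check at runtime whether a given card can do doubles, you can query its compute capability with the standard CUDA runtime API (1.3 or higher means native double support). A minimal sketch, with no error handling:

    #include <stdio.h>
    #include <cuda_runtime.h>

    int main(void)
    {
        int count = 0;
        cudaGetDeviceCount(&count);
        for (int dev = 0; dev < count; ++dev) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, dev);
            /* Double precision requires compute capability 1.3 or higher. */
            int hasDouble = prop.major > 1 || (prop.major == 1 && prop.minor >= 3);
            printf("Device %d (%s): compute %d.%d, doubles: %s\n",
                   dev, prop.name, prop.major, prop.minor,
                   hasDouble ? "yes" : "no");
        }
        return 0;
    }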

Upvotes: 0

Mike

Reputation: 4590

CUDA is probably the better option if you know your target architecture uses NVIDIA chips. You have complete control over your data transfers, instruction paths and order of operations. You can also get by with far fewer __syncthreads() calls when you're working at the lower level.
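
To illustrate that kind of control, here is a minimal block-level sum reduction sketch; the kernel name and the 256-threads-per-block assumption are just for illustration, but note that every __syncthreads() barrier is placed explicitly by the programmer, and only where threads actually exchange data:

    __global__ void blockSum(const float *in, float *out)
    {
        __shared__ float buf[256];          // assumes blockDim.x == 256
        unsigned int tid = threadIdx.x;
        buf[tid] = in[blockIdx.x * blockDim.x + tid];
        __syncthreads();                    // all loads visible before reducing

        for (unsigned int s = blockDim.x / 2; s > 0; s >>= 1) {
            if (tid < s)
                buf[tid] += buf[tid + s];
            __syncthreads();                // exactly one barrier per step, no more
        }
        if (tid == 0)
            out[blockIdx.x] = buf[0];
    }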

DirectX 10 will be easier to interface against, I should think, but if you really want to push your speed optimization, you have to bypass the extra layer. DirectX 10 also won't know when to use texture memory versus constant memory versus shared memory as well as you will, since the right choice depends on your particular algorithm.
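
As a sketch of what that explicit choice looks like in CUDA source (the names, the sizes and the trivial filter are made up for illustration):

    // coeffs lives in constant memory: small, read-only, cached, and
    // filled from the host with cudaMemcpyToSymbol(coeffs, ...).
    __constant__ float coeffs[16];

    __global__ void filter(const float *in, float *out, int n)
    {
        __shared__ float tile[256];         // shared memory, staged by the block;
                                            // assumes blockDim.x == 256
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        tile[threadIdx.x] = (i < n) ? in[i] : 0.0f;
        __syncthreads();
        if (i < n)
            out[i] = tile[threadIdx.x] * coeffs[threadIdx.x % 16];
    }

Choosing among those memory spaces by hand is exactly the tuning knob that a higher-level API hides from you.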

If you have access to a Tesla C1060 or something like that, CUDA is by far the better choice. You can really speed things up if you know the specifics of your GPGPU - I've seen a 188x speedup in one particular algorithm on a Tesla versus my desktop.

Upvotes: 3

that_should_work

Reputation:

I find CUDA awkward. It's not C, but a subset of it. It doesn't support double-precision floating point natively; it's emulated. For single precision it's okay, though. It depends on the type of task you throw at it: you have to spend more time computing in parallel than you spend passing the data around for it to be worth using. But that issue is not unique to CUDA.
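
One rough way to check whether the copy dominates for your task is to time the transfer and the kernel separately with CUDA events; a sketch, where the trivial kernel and the array size are just placeholders:

    #include <stdio.h>
    #include <stdlib.h>
    #include <cuda_runtime.h>

    __global__ void scale(float *x, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            x[i] *= 2.0f;                   // almost no arithmetic per byte moved
    }

    int main(void)
    {
        const int n = 1 << 22;              // arbitrary size for the experiment
        const size_t bytes = n * sizeof(float);
        float *h = (float *)malloc(bytes);
        float *d;
        for (int i = 0; i < n; ++i) h[i] = 1.0f;
        cudaMalloc((void **)&d, bytes);

        cudaEvent_t t0, t1, t2;
        cudaEventCreate(&t0); cudaEventCreate(&t1); cudaEventCreate(&t2);

        cudaEventRecord(t0, 0);
        cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);
        cudaEventRecord(t1, 0);
        scale<<<(n + 255) / 256, 256>>>(d, n);
        cudaEventRecord(t2, 0);
        cudaEventSynchronize(t2);

        float copyMs = 0, kernelMs = 0;
        cudaEventElapsedTime(&copyMs, t0, t1);   // host-to-device transfer time
        cudaEventElapsedTime(&kernelMs, t1, t2); // kernel execution time
        printf("copy: %.3f ms, kernel: %.3f ms\n", copyMs, kernelMs);

        cudaFree(d);
        free(h);
        return 0;
    }

If the copy time swamps the kernel time, as it will for a kernel this trivial, the GPU isn't buying you anything.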

I'd wait for Apple's OpenCL, which seems like it will become the industry standard for parallel computing.

Upvotes: 1

dicroce

Reputation: 46770

Well, CUDA is portable across operating systems, unlike DirectX... That's a big win if you ask me...

Upvotes: 0
