chasep255

Reputation: 12185

ArrayFire AF_BACKEND_CPU not multi-threaded?

I usually use the OpenCL backend when I use ArrayFire, running Intel OpenCL on my i7 CPU. When I switched to the AF_BACKEND_CPU backend, my code became about 10-15x slower. I checked and noticed that it was only running on one core. Since my processor only has 4 cores, I suspect that it is also not using SSE or AVX instructions, which would account for the rest of the slowdown. I feel like the ArrayFire CPU backend should be faster. Is there a way to make it multithreaded?

Upvotes: 1

Views: 972

Answers (2)

E. Odj

Reputation: 91

I was wondering the same thing. It turns out that the milestone has since been moved to 3.5.0 (https://github.com/arrayfire/arrayfire/issues/451).

As far as I can see, AF still uses only one core for now, so on a 4-core machine 3 cores sit idle.

In general I'd suggest using AF together with the GPU and creating af::arrays only when needed, since otherwise there is no way to hold data just on the GPU or just on the CPU (see "How to explicitly get linear indices from arrayfire?" and http://forums.accelereyes.com/forums/viewtopic.php?f=17&t=43097&p=61730&hilit=copy+host+memory+into+an+array#p61727 on how to construct af::arrays ad hoc).

Also, as a general rule of thumb, for many tasks the GPU implementation is still much faster than the CPU implementation, even if the task is not perfectly suited to the GPU. See, for example, sorting algorithms, which normally involve a lot of branching.

If you insist on using the CPU in parallel, you could also try putting OpenMP, MPI or just std::thread on top of AF and parallelizing that way. I didn't gain much with std::thread for sorting operations, though.
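
To make the std::thread idea concrete, here is a minimal sketch of the pattern using only the C++ standard library: split the data into chunks, sort each chunk on its own thread, then merge the sorted chunks. It does not call ArrayFire at all. The function name parallel_chunk_sort and the chunking scheme are my own assumptions for illustration, and std::sort stands in for whatever per-chunk work you would hand to AF. Whether this is actually faster depends on the data size and the merge cost.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not an ArrayFire function): sorts `data` in place
// using up to `num_threads` std::thread workers.
void parallel_chunk_sort(std::vector<float>& data, unsigned num_threads) {
    assert(num_threads > 0);
    const std::size_t n = data.size();
    if (n < 2) return;  // nothing to sort
    const std::size_t chunks = std::min<std::size_t>(num_threads, n);

    // Chunk i covers the half-open index range [bounds[i], bounds[i + 1]).
    std::vector<std::size_t> bounds(chunks + 1);
    for (std::size_t i = 0; i <= chunks; ++i) bounds[i] = i * n / chunks;

    // Sort each chunk on its own thread; the chunks do not overlap,
    // so no locking is needed.
    std::vector<std::thread> workers;
    workers.reserve(chunks);
    for (std::size_t i = 0; i < chunks; ++i) {
        workers.emplace_back([&data, &bounds, i] {
            std::sort(data.begin() + bounds[i], data.begin() + bounds[i + 1]);
        });
    }
    for (auto& w : workers) w.join();

    // Merge neighbouring sorted runs pairwise until one run remains.
    // This part is single-threaded.
    for (std::size_t width = 1; width < chunks; width *= 2) {
        for (std::size_t i = 0; i + width < chunks; i += 2 * width) {
            const std::size_t end = std::min(i + 2 * width, chunks);
            std::inplace_merge(data.begin() + bounds[i],
                               data.begin() + bounds[i + width],
                               data.begin() + bounds[end]);
        }
    }
}
```

Calling parallel_chunk_sort(v, 4) on a std::vector<float> v would use four workers. The merge step runs on a single thread, which may be part of why the gain for sorting tends to be modest.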

Upvotes: 0

Zoltan Csati

Reputation: 699

The CPU backend is not yet multithreaded, but I suspect that will change starting with version 3.4.0 (see "Sparse Support, Thread Safety, Parallel CPU" on https://github.com/arrayfire/arrayfire/milestones).

Upvotes: 2
