Reputation: 5524
I have an OpenGL program that will be doing a fairly good amount of matrix multiplies per second. These will be 4x4 matrices of 128 bytes each. Both my CPU and GPU are pretty up to date (I have a MacBook Pro (Retina, 13-inch, Mid 2014)). I know that GPUs are typically more parallel-oriented and might be optimized for this kind of work. Would it be faster to have the CPU or the GPU do the multiplies?
Upvotes: 3
Views: 532
Reputation: 162317
I have an OpenGL program that will be doing a fairly good amount of matrix multiplies per second.
Define "fairly good amount of matrix multiplies". Keep in mind that CPUs, too, are quite capable of doing this kind of computation. With a vectorizing instruction set, a 4×4 matrix-matrix multiplication boils down to as few as 16 FMA (fused multiply-add) instructions. That's not a lot. And given that modern CPUs want to be kept busy too, and that you often need the matrices for on-CPU computations anyway, it makes a lot of sense to keep the matrix computations on the CPU.
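To make the count of 16 concrete, here is a minimal sketch in plain C. The `mat4` type and `mat4_mul` function are hypothetical names of my own, not part of OpenGL or any library. I am also assuming row-major `float` matrices (64 bytes each); the 128-byte matrices in the question would be `double`, where the structure is the same but each row needs a wider register or two instructions. Each inner `j` loop is one 4-wide multiply-add, and there are 4 rows × 4 steps = 16 of them. Whether the compiler actually emits SIMD or fused instructions depends on the compiler, the flags, and the target CPU.

```c
#include <assert.h>

/* Hypothetical row-major 4x4 float matrix: m[row][col]. */
typedef struct { float m[4][4]; } mat4;

/* out = a * b.
 * Each output row is the sum of four rows of b, each scaled by one
 * element of a. That is 4 rows x 4 steps = 16 four-wide multiply-adds. */
void mat4_mul(mat4 *out, const mat4 *a, const mat4 *b)
{
    /* This sketch does not support in-place multiplication. */
    assert(out != a && out != b);

    for (int i = 0; i < 4; ++i) {
        float row[4] = {0.0f, 0.0f, 0.0f, 0.0f};
        for (int k = 0; k < 4; ++k) {
            const float s = a->m[i][k];      /* scalar to broadcast */
            for (int j = 0; j < 4; ++j)      /* one 4-wide multiply-add */
                row[j] += s * b->m[k][j];
        }
        for (int j = 0; j < 4; ++j)
            out->m[i][j] = row[j];
    }
}
```

Calling `mat4_mul(&r, &a, &b)` on three distinct `mat4` values stores the product in `r`. The point is only that the whole operation is a handful of instructions on data that is already in cache, so the call itself is cheap compared with any round trip to the GPU.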
Doing it on the GPU only pays off if you can easily parallelize the computation of all those matrices. For a single 4×4 matrix-matrix multiply, the overhead of loading the matrices onto the GPU and doing the housekeeping easily eats up any performance benefit.
Upvotes: 2