Framester

Reputation: 35521

How to estimate relative performance of CUDA GPUs?

How can I estimate the CUDA performance of cards that I don't own, i.e. new cards?

For instance, I found an incomplete CUDA example whose author wrote that it takes 0.7 s on his GeForce 8600 GT. On my Quadro it takes 1.7 s.

My question is: is the code I used to fill the gaps faulty, or is the GeForce 8600 GT really more than twice as fast?

The kernel is memory bound, but my card has a higher memory bandwidth. I don't know what conclusions to draw from this.

Name                  Quadro FX 580   GeForce 8600 GT
CUDA cores                       32                32
Core clock (MHz)                450               540
Memory clock (MHz)              400               700
Memory BW (GB/s)               25.6              22.4
Shader clock (MHz)             ????              1180

Upvotes: 3

Views: 899

Answers (1)

Programmer

Reputation: 6753

I just want to give you some pointers to possible sources of error. Firstly, use cudaEvents to time your code rather than the CUDA profiler, as cudaEvents is more accurate. Secondly, check what the author is measuring: is he reporting only the computation time, or does he also include the time to transfer data to and from the GPU? Are you measuring the same thing?
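To illustrate the timing advice, here is a minimal sketch that uses CUDA events to time the host-to-device copy, the kernel, and the device-to-host copy separately. The kernel `scaleKernel`, the helper `effectiveBandwidthGBs`, the `CHECK` macro and the array size are all hypothetical stand-ins, not the code from the question.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(1);                                                  \
        }                                                             \
    } while (0)

// Hypothetical memory-bound kernel standing in for the one in the question.
__global__ void scaleKernel(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2.0f * in[i];
}

// Hypothetical helper: effective bandwidth in GB/s for `bytes` moved in `ms` milliseconds.
static double effectiveBandwidthGBs(double bytes, double ms)
{
    return bytes / (ms * 1.0e-3) / 1.0e9;
}

int main()
{
    const int n = 1 << 22;                      // assumed problem size
    const size_t bytes = n * sizeof(float);

    float *hIn = (float *)malloc(bytes);
    float *hOut = (float *)malloc(bytes);
    if (!hIn || !hOut) return 1;
    for (int i = 0; i < n; ++i) hIn[i] = 1.0f;

    float *dIn, *dOut;
    CHECK(cudaMalloc((void **)&dIn, bytes));
    CHECK(cudaMalloc((void **)&dOut, bytes));

    cudaEvent_t e0, e1, e2, e3;
    CHECK(cudaEventCreate(&e0));
    CHECK(cudaEventCreate(&e1));
    CHECK(cudaEventCreate(&e2));
    CHECK(cudaEventCreate(&e3));

    CHECK(cudaEventRecord(e0));                 // start of host-to-device copy
    CHECK(cudaMemcpy(dIn, hIn, bytes, cudaMemcpyHostToDevice));
    CHECK(cudaEventRecord(e1));                 // start of kernel
    scaleKernel<<<(n + 255) / 256, 256>>>(dIn, dOut, n);
    CHECK(cudaGetLastError());
    CHECK(cudaEventRecord(e2));                 // start of device-to-host copy
    CHECK(cudaMemcpy(hOut, dOut, bytes, cudaMemcpyDeviceToHost));
    CHECK(cudaEventRecord(e3));
    CHECK(cudaEventSynchronize(e3));            // wait until all work has finished

    float msIn, msKernel, msOut;
    CHECK(cudaEventElapsedTime(&msIn, e0, e1));
    CHECK(cudaEventElapsedTime(&msKernel, e1, e2));
    CHECK(cudaEventElapsedTime(&msOut, e2, e3));

    printf("H2D copy: %.3f ms\n", msIn);
    printf("kernel:   %.3f ms\n", msKernel);
    printf("D2H copy: %.3f ms\n", msOut);
    // The kernel reads n floats and writes n floats.
    printf("kernel effective bandwidth: %.2f GB/s\n",
           effectiveBandwidthGBs(2.0 * bytes, msKernel));

    CHECK(cudaEventDestroy(e0)); CHECK(cudaEventDestroy(e1));
    CHECK(cudaEventDestroy(e2)); CHECK(cudaEventDestroy(e3));
    CHECK(cudaFree(dIn)); CHECK(cudaFree(dOut));
    free(hIn); free(hOut);
    return 0;
}
```

Reporting the three intervals separately makes it easier to see whether a published figure such as 0.7 s could refer to the kernel alone or to the whole round trip. The first kernel launch in a process can include one-time start-up overhead, so it may be worth running the kernel once before the timed launch.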

Thirdly, the CUDA architecture is changing quite fast. For example, for cards with compute capability (cc) 1.x, it is suggested that we use shared memory to get better performance; cards with cc 2.x, however, have an L1 cache on each multiprocessor that makes global memory accesses quite fast. So you may also want to compare the architecture of the two cards and their compute capabilities.
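To compare the cards you do have access to, a short sketch like the following prints the compute capability and a few basic properties of each installed device via `cudaGetDeviceProperties`. It also includes a hypothetical helper, `peakBandwidthGBs`, that computes a theoretical peak bandwidth from spec-sheet numbers. The 128-bit bus width and the double data rate used for the GeForce 8600 GT are assumptions that do not appear in the question; with them, the 700 MHz memory clock matches the 22.4 GB/s listed there.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: theoretical peak memory bandwidth in GB/s.
// memClockMHz:       memory clock from the spec sheet
// busWidthBits:      memory interface width in bits
// transfersPerClock: 2 for double data rate memory
static double peakBandwidthGBs(double memClockMHz, int busWidthBits,
                               int transfersPerClock)
{
    return memClockMHz * 1.0e6 * transfersPerClock * (busWidthBits / 8.0) / 1.0e9;
}

int main()
{
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        fprintf(stderr, "No CUDA device found\n");
        return 1;
    }

    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) continue;
        printf("Device %d: %s\n", dev, prop.name);
        printf("  compute capability:      %d.%d\n", prop.major, prop.minor);
        printf("  multiprocessors:         %d\n", prop.multiProcessorCount);
        printf("  global memory (MB):      %lu\n",
               (unsigned long)(prop.totalGlobalMem >> 20));
        printf("  shared mem / block (KB): %lu\n",
               (unsigned long)(prop.sharedMemPerBlock >> 10));
    }

    // Spec-sheet check for a card you do not own (bus width and DDR are assumed).
    printf("GeForce 8600 GT theoretical peak: %.1f GB/s\n",
           peakBandwidthGBs(700.0, 128, 2));
    return 0;
}
```

Comparing the effective bandwidth a kernel actually achieves with this theoretical peak gives a rough idea of how much headroom a memory-bound kernel has on a given card.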

Upvotes: 2
