Speed up calculation

Question

I have a simulation application that I have written in both C and CUDA. To measure the speedup I recorded the execution time in both cases. In CUDA, I used CUDA events to measure the time, and then divided the CPU time by the GPU time (as is usually done). The image of the speedup is provided below.

The weird thing about the speedup graph is that the speedup first increases to 55X, then decreases to 35X, and then increases again as the total number of threads increases. I am not sure why this is happening or how I could figure out the reason behind such an output. I am using a GTX 560 Ti GPU card with 448 cores. The number of threads per block is 1024 (the maximum), so there is 1 block at a time on each SM. Is this happening because of occupancy issues, and how could I definitely figure out the reason behind this kind of speedup graph?

[Image: speedup graph]

pQB · Accepted Answer

The peaks in the speedup seem to be related to the execution times on the CPU. Looking at the GPU time, it seems to increase linearly with the number of agents. However, the CPU time, which also increases linearly in general terms, drops in the range [0.6, 1.6] approx. and has some peaks in the range [2.6, 3.1] approx.

Taking the above into account, your maximum speedup of 55x decreases in the range [0.6, 1.1] approx. because your CPU time also decreases. Since the speedup is calculated as CPU time / GPU time, it is normal that the result is smaller. The same reasoning applies to the second range, [2.6, 3.1].

How could you figure out the reason behind this kind of speedup graph? My guess is that the CPU was interrupted by some external event (I/O, another program running on the CPU, the OS, ...).

To calculate speedups more accurately, repeat the experiment at least 10 times as individual executions, i.e. do not put a loop inside your main function to run it 10 times. With 10, 20, 30 or even more individual executions you can calculate the mean time and also the variance. Then study the execution times: one or two peaks can be treated as outliers and ignored. If you see a trend, a deeper study is needed.
