Framester

Reputation: 35471

How to measure the gflops of a matrix multiplication kernel?

In the book Programming Massively Parallel Processors the number of gflops is used to compare the efficiency of different matrix multiplication kernels. How would I compute this for my own kernels on my own machine?

Somewhere in the NVIDIA forums I found this 'algorithm', but I don't know how valid it is or where the factor of two comes from.

NumOps = 2 * pow(MatrixSize,3)
gflops = 1.0e-9 * NumOps / ExecutionTime

p.s. please feel free to change the tags...

Upvotes: 6

Views: 5531

Answers (1)

Heatsink

Reputation: 7751

You can measure GFLOP/s by running the algorithm with a large input and measuring the execution time, then plugging the execution time and matrix size into that formula. For matrix sizes big enough to keep the entire machine busy, the achieved FLOP rate depends only weakly on matrix size.

The GPU matrix multiplication algorithm performs the same number of floating-point operations as the naive algorithm:

for (int i = 0; i < MatrixSize; i++)
  for (int j = 0; j < MatrixSize; j++)
    for (int k = 0; k < MatrixSize; k++)
      C[j][i] += A[j][k] * B[k][i];

There are 2 floating-point operations in the loop body (one multiply and one add), and MatrixSize * MatrixSize * MatrixSize iterations of the loop body, which gives you the formula for NumOps. GFLOP/s is just the number of floating-point operations per second, divided by 10^9 ('giga').

Upvotes: 8
