Reputation: 13437
I'm wondering how I would calculate GFLOPS for a program of mine, say a CUDA application.
Do I need to measure the execution time and count the number of floating point operations in my code? If I had an operation like "logf", would it count as just one FLOP?
Upvotes: 2
Views: 1106
Reputation: 129524
The number of ACTUAL floating point operations depends on exactly how the code is written, since compilers can optimize in both directions. That is, merging common operations:

    c = (a * 4.0 + b * 4.0);

can become:

    c = (a + b) * 4.0;

which is one multiplication less than what you wrote. But the compiler can also convert something to MORE operations:
    c = a / b;

may turn into:

    temp = 1 / b;
    c = temp * a;

(This is because 1/x is "simpler" than y/x, and multiplication is faster than division.)
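With those caveats in mind, the usual recipe is what you guessed: count the floating point operations as they appear in the source, measure the execution time, and divide. A minimal CUDA sketch (the SAXPY kernel, sizes and launch configuration here are my own illustration, not anything from your code):

    #include <cstdio>
    #include <cuda_runtime.h>

    // SAXPY: y = a*x + y. At source level that is 2 flops per element
    // (one multiply, one add), though the compiler will likely emit a
    // single fused multiply-add instruction for the pair.
    __global__ void saxpy(int n, float a, const float *x, float *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }

    int main() {
        const int n = 1 << 24;                    // 16M elements
        float *x, *y;
        cudaMalloc(&x, n * sizeof(float));
        cudaMalloc(&y, n * sizeof(float));
        cudaMemset(x, 0, n * sizeof(float));      // real code would load data
        cudaMemset(y, 0, n * sizeof(float));

        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        cudaEventRecord(start);
        saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, start, stop);   // elapsed kernel time in ms

        double flops = 2.0 * n;                   // source-level flop count
        printf("%.3f ms, %.2f GFLOPS\n", ms, flops / (ms * 1e-3) / 1e9);

        cudaFree(x);
        cudaFree(y);
        return 0;
    }

Keep in mind that SAXPY moves 12 bytes per element for those 2 flops, so the figure you get back says more about memory bandwidth than arithmetic throughput; more on that below.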
As mentioned in the comments, some floating point operations (log, sin, cos, etc.) take more than one, and often more than ten, hardware operations to get the result.
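One way to see how far logf is from a single hardware operation is to repeat the operation enough times per element that memory traffic stops mattering, then compare against a plain multiply. A rough microbenchmark sketch (the kernel names, REPS value and sizes are my own choices):

    #include <cstdio>
    #include <cuda_runtime.h>

    #define REPS 256   // repeats per element, so both kernels are compute bound

    __global__ void mul_bench(int n, float *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            float v = y[i];
            for (int r = 0; r < REPS; ++r) v = v * 1.000001f;  // one multiply
            y[i] = v;
        }
    }

    __global__ void log_bench(int n, float *y) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            float v = y[i] + 2.0f;   // keep the argument positive
            for (int r = 0; r < REPS; ++r) v = logf(v) + 2.0f; // logf plus one add
            y[i] = v;
        }
    }

    int main() {
        const int n = 1 << 22;
        float *y;
        cudaMalloc(&y, n * sizeof(float));
        cudaMemset(y, 0, n * sizeof(float));

        mul_bench<<<(n + 255) / 256, 256>>>(n, y);   // warm-up launch
        cudaDeviceSynchronize();

        cudaEvent_t start, stop;
        cudaEventCreate(&start);
        cudaEventCreate(&stop);

        cudaEventRecord(start);
        mul_bench<<<(n + 255) / 256, 256>>>(n, y);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);
        float mul_ms;
        cudaEventElapsedTime(&mul_ms, start, stop);

        cudaEventRecord(start);
        log_bench<<<(n + 255) / 256, 256>>>(n, y);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);
        float log_ms;
        cudaEventElapsedTime(&log_ms, start, stop);

        printf("multiply: %.3f ms  logf: %.3f ms  (ratio %.1fx)\n",
               mul_ms, log_ms, log_ms / mul_ms);
        cudaFree(y);
        return 0;
    }

The ratio gives a rough idea of how many simple operations a logf is worth on your particular GPU, which is what you would need to fold into any hand count.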
Another factor to take into account is "loads" and "stores". These can be quite hard to predict, as they depend heavily on the compiler's code generation, the number of registers available to the compiler at a given point, and so on. Whether loads and stores actually count as floating point operations depends on how you look at things, but they certainly count towards the total execution time. If there is a lot of data to work through, but each step is really simple (e.g. c = a + b, where a, b and c are vectors), the time to fetch the data from memory is significantly longer than the execution time of the add. On the other hand, c = log(a) + log(b); would almost certainly "hide" the time to load and store the results, because log itself takes a lot longer than the load or store operations.
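To put a number on that: with float vectors, c = a + b costs two 4-byte loads and one 4-byte store for a single flop, so memory bandwidth caps the achievable rate no matter how fast the arithmetic units are. A back-of-the-envelope sketch (the 300 GB/s bandwidth is an assumed, device-dependent figure):

    #include <cstdio>

    int main() {
        // c = a + b on floats: 12 bytes of traffic (two loads, one store)
        // buy exactly 1 flop, i.e. an arithmetic intensity of 1/12 flop/byte.
        double bandwidth_gb_s = 300.0;        // assumed device bandwidth
        double flops_per_byte = 1.0 / 12.0;
        printf("memory-bound ceiling: %.1f GFLOPS\n",
               bandwidth_gb_s * flops_per_byte);   // ~25 GFLOPS here
        return 0;
    }

If the GFLOPS you measure for such a kernel lands near that ceiling, you are really measuring the memory system, not the floating point hardware.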
Upvotes: 1