Reputation: 2694
I'm trying to calculate CPU / GPU FLOPS performance but I'm not sure if I'm doing it correctly.
Let's say we have a Kaby Lake CPU (i7 7700HQ, 2.8 GHz, 4 cores) and a Pascal GPU (1.3 GHz, 768 CUDA cores).
This Wiki page says that Kaby Lake CPUs perform 32 single-precision (FP32) FLOPs per cycle per core and Pascal cards perform 2 single-precision (FP32) FLOPs per cycle per CUDA core, which means we can compute their total FLOPS performance using the following formulas:
CPU:
TOTAL_FLOPS = 2.8 GHz * 4 cores * 32 FLOPs/cycle ≈ 358 GFLOPS
GPU:
TOTAL_FLOPS = 1.3 GHz * 768 cores * 2 FLOPs/cycle ≈ 1997 GFLOPS
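Written out as a quick sanity check in Python (just the arithmetic above, nothing queried from hardware):

```python
# Peak theoretical FP32 throughput = clock (GHz) * cores * FLOPs per cycle per core,
# with the FLOPs-per-cycle figures taken from the wiki table referenced above.
def peak_gflops(clock_ghz, cores, flops_per_cycle):
    return clock_ghz * cores * flops_per_cycle

print(peak_gflops(2.8, 4, 32))   # CPU (Kaby Lake): 358.4 GFLOPS
print(peak_gflops(1.3, 768, 2))  # GPU (Pascal):   1996.8 GFLOPS
```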
[SOLVED] Most of the guides I've seen (like this one) use physical cores in the formula. What I don't understand is why we don't use threads (logical cores) instead. Weren't hardware threads created specifically to double floating-point throughput? Why are we ignoring them here?
Am I doing this correctly at all? I couldn't find a single reliable source for calculating FLOPS; all the information on the internet is contradictory. For the i7 7700HQ Kaby Lake CPU I found FLOPS values as low as 29 GFLOPS, even though the formula above gives 358 GFLOPS. I don't know what to believe.
[EDITED] Is there a cross-platform (Windows, macOS, Linux) library for Node.js / Python / C++ that reports all the GPU stats (shading cores, clock, FP32 and FP64 FLOPS values) so I could calculate the performance myself, or a library that automatically calculates the maximum theoretical FP32 and FP64 FLOPS by exercising all available CPU / GPU instruction sets (AVX, SSE, etc.)? It's quite ridiculous that we can't just get the FLOPS stats from the CPU / GPU directly; instead we have to download and parse a wiki page to get the value. Even with C++, it seems (I don't know for sure) that we have to download the 2 GB CUDA toolkit just to get access to Nvidia GPU information, which would make it practically impossible to distribute the app, since no one would download a 2 GB application. The only library I could find is a 40-year-old C library, written before these advanced instruction sets even existed.
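To make the kind of thing I'm after concrete, here is a minimal Python sketch. It assumes an Nvidia GPU and the nvidia-ml-py (pynvml) package, which talks to the driver without the CUDA toolkit; note it only exposes clocks, the device name and (on newer drivers) the core count, so the FLOPs-per-cycle figure still has to be hard-coded from the wiki table:

```python
# Minimal sketch (assumptions: Nvidia GPU, nvidia-ml-py installed, driver present).
import pynvml

FP32_FLOPS_PER_CYCLE = 2  # Pascal: 2 FP32 FLOPs per cycle per CUDA core (from the wiki table)

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    name = pynvml.nvmlDeviceGetName(handle)  # bytes on older pynvml versions, str on newer
    sm_clock_mhz = pynvml.nvmlDeviceGetMaxClockInfo(handle, pynvml.NVML_CLOCK_SM)

    try:
        # Core count is only exposed by recent drivers / NVML versions; may not be available.
        cores = pynvml.nvmlDeviceGetNumGpuCores(handle)
    except (AttributeError, pynvml.NVMLError):
        cores = 768  # fall back to a hard-coded value (e.g. from the spec sheet)

    peak_gflops = sm_clock_mhz / 1000 * cores * FP32_FLOPS_PER_CYCLE
    print(f"{name}: ~{peak_gflops:.0f} GFLOPS (theoretical FP32 peak)")
finally:
    pynvml.nvmlShutdown()
```

Even this still relies on a lookup table for the per-cycle throughput, which is exactly the part I was hoping a library could provide.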
Upvotes: 0
Views: 2867