Reputation: 14622
How faster is tensorflow-gpu
with AVX and AVX2 compared with it without AVX and AVX2?
I tried to find an answer using Google but with no success. It's hard to recompile tensorflow-gpu
for Windows. So, I want to know if it worth it.
Upvotes: 14
Views: 5346
Reputation: 57923
If your computation is one giant matmul on CPU, you will get 3x speed-up on Xeon V3 (see benchmark here). But it's also possible to see no speed-up, presumably because there's not enough time spent in high arithmetic intensity ops executed on CPU.
Here's a table from "High Performance Models" guide for training of resnet50 on CPU with difference optimizations. It looks like you can get 2.5 speed-up with best settings
| Optimization | Data Format | Images/Sec | Intra threads | Inter Threads |
: : : (step time) : : :
| ------------ | ----------- | ------------ | ------------- | ------------- |
| AVX2 | NHWC | 6.8 (147ms) | 4 | 0 |
| MKL | NCHW | 6.6 (151ms) | 4 | 1 |
| MKL | NHWC | 5.95 (168ms) | 4 | 1 |
| AVX | NHWC | 4.7 (211ms) | 4 | 0 |
| SSE3 | NHWC | 2.7 (370ms) | 4 | 0 |
If you are able to compile an optimized version for Windows, it would help to mention it in this issue -- https://github.com/yaroslavvb/tensorflow-community-wheels/issues/13 , it seems there's some demand for such a build
Upvotes: 12