Dmitry

Reputation: 14622

How much faster is tensorflow-gpu with AVX and AVX2 than without them?

I tried to find an answer using Google, but with no success. It's hard to recompile tensorflow-gpu for Windows, so I want to know whether it is worth it.

Upvotes: 14

Views: 5346

Answers (1)

Yaroslav Bulatov

Reputation: 57923

If your computation is one giant matmul on the CPU, you will get a 3x speed-up on Xeon V3 (see benchmark here). But it's also possible to see no speed-up at all, presumably because not enough time is spent in high-arithmetic-intensity ops executed on the CPU.
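To find out which case applies to you, you can time the CPU-bound part of your workload under both builds. Here is a minimal sketch of a timing helper using only the standard library. The names `time_op` and `speedup` are made up for illustration, and the TensorFlow usage in the comments assumes TF 2.x with eager execution.

```python
import time


def time_op(fn, warmup=2, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` calls of fn()."""
    # Warm-up calls keep one-time setup costs out of the measurement.
    for _ in range(warmup):
        fn()
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline_seconds, optimized_seconds):
    """Ratio of baseline time to optimized time (3.0 means 3x faster)."""
    return baseline_seconds / optimized_seconds


# Hypothetical usage (assumes TF 2.x, eager mode); run once per build:
#   import tensorflow as tf
#   with tf.device("/CPU:0"):
#       a = tf.random.normal([4096, 4096])
#       seconds = time_op(lambda: tf.matmul(a, a).numpy())
```

Run the same script against the stock build and the AVX/AVX2 build, then pass the two timings to `speedup`. If the ratio is close to 1 for your real model, the recompile is probably not worth the effort.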

Here's a table from the "High Performance Models" guide for training resnet50 on CPU with different optimizations. It looks like you can get a 2.5x speed-up with the best settings (AVX2 compared with SSE3).

| Optimization | Data Format | Images/Sec (step time) | Intra Threads | Inter Threads |
| ------------ | ----------- | ---------------------- | ------------- | ------------- |
| AVX2         | NHWC        | 6.8 (147ms)            | 4             | 0             |
| MKL          | NCHW        | 6.6 (151ms)            | 4             | 1             |
| MKL          | NHWC        | 5.95 (168ms)           | 4             | 1             |
| AVX          | NHWC        | 4.7 (211ms)            | 4             | 0             |
| SSE3         | NHWC        | 2.7 (370ms)            | 4             | 0             |
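
For reference, the speed-up of each row relative to the SSE3 baseline can be worked out from the images/sec column. This is just arithmetic on the numbers in the table; the variable names are mine.

```python
# Images/sec from the table above, keyed by (optimization, data format).
images_per_sec = {
    ("AVX2", "NHWC"): 6.8,
    ("MKL", "NCHW"): 6.6,
    ("MKL", "NHWC"): 5.95,
    ("AVX", "NHWC"): 4.7,
    ("SSE3", "NHWC"): 2.7,
}

baseline = images_per_sec[("SSE3", "NHWC")]

# Throughput ratio against the SSE3 build, rounded to two decimals.
speedups = {
    config: round(rate / baseline, 2)
    for config, rate in images_per_sec.items()
}

for (optimization, data_format), ratio in speedups.items():
    print(f"{optimization:5s} {data_format}: {ratio:.2f}x")
```

So AVX alone comes out at roughly 1.7x over SSE3 and AVX2 at roughly 2.5x, for this CPU-only resnet50 training benchmark.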

If you are able to compile an optimized version for Windows, it would help to mention it in this issue: https://github.com/yaroslavvb/tensorflow-community-wheels/issues/13 . There seems to be some demand for such a build.

Upvotes: 12
