What is the optimal precision for training a typical Deep Neural Network?

I am researching the optimal precision for training a DNN. I learned that, for inference, even a compressed 8-bit precision should work, but for training we need higher-precision numbers. What would be the optimal precision for deep learning (fp16, fp32, or fp64)? I may use tensorflow-gpu for this purpose.

Upvotes: 0

Views: 520

Answers (2)

Prune

Reputation: 77847

This depends on your evaluation function for "optimal": is your focus training time (lower precision is faster), accuracy (lower precision is often less accurate), or some other resource? It also depends somewhat on the model's complexity and topology.

A ConvNet on MNIST will do fine with 8-bit floats: training is faster, and the accuracy difference (if any) will be insignificant. If you move to something more interdependent and fragile (perhaps a kernel-starved GNN), then you'll notice a loss of accuracy when dropping to 8-bit.

Again, depending on your needs, you can sometimes save training time by dropping to 8-bit floats and recover some of the lost accuracy by slightly widening your model (more kernels in the convolution layers).
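The accuracy cost of low precision is easy to see with plain NumPy: a minimal sketch (the `accumulate` helper is hypothetical, not from TensorFlow) showing how accumulating many small updates, as a training loop accumulates gradients, stalls in float16 once the running sum outgrows the format's spacing.

```python
import numpy as np

# Hypothetical illustration: repeatedly add a small step to a running
# total, as gradient accumulation does during training. In float16 the
# spacing between representable numbers at 2048 is 2, so adding 1.0
# no longer changes the sum and accumulation silently stalls.
def accumulate(dtype, step=1.0, n=5000):
    total = dtype(0)
    for _ in range(n):
        total = dtype(total + dtype(step))
    return float(total)

print(accumulate(np.float16))  # stalls at 2048.0
print(accumulate(np.float32))  # reaches 5000.0 exactly
```

The same effect hits real training when small gradient updates fall below the resolution of a low-precision weight, which is one reason mixed-precision schemes keep a float32 master copy of the weights.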

Upvotes: 0

BlueSun

Reputation: 3570

The optimal precision is float32 in most cases. float64 makes execution on the GPU significantly slower. On the other hand, unless you have a Tesla P100 GPU, using float16 will not make execution faster either.

Upvotes: 2
