Reputation: 151
CNN architectures like DenseNet stress parameter efficiency, which usually results in fewer FLOPs. However, what I am struggling to understand is why this is important. DenseNet, in particular, has low inference speed. Isn't the purpose of reducing parameter count/FLOPs to decrease inference time? Is there another real-world reason, such as lower energy consumption, for these optimizations?
Upvotes: 2
Views: 1033
Reputation: 203
There is a difference between parameter/FLOP efficiency and overall inference time. Having fewer parameters or FLOPs does not guarantee faster inference, because wall-clock inference speed also depends on the architecture and on how the predictions are actually computed, e.g. how well the operations parallelize on the hardware and how they access memory.
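A minimal NumPy sketch of this point: two hypothetical "models" below perform exactly the same number of multiply-accumulates, but one does a single wide matrix multiply while the other chains many small sequential ones (loosely mirroring DenseNet's many concatenated layers). The sizes and layer counts are made up for illustration; the sequential variant typically runs slower despite the identical FLOP budget.

```python
import time
import numpy as np

rng = np.random.default_rng(0)

d = 512
x = rng.standard_normal((1, d))

# Model A: one wide layer, a single (d x 16d) matmul -> 16*d^2 MACs.
wide = rng.standard_normal((d, 16 * d))

# Model B: 16 narrow (d x d) layers applied sequentially -> also 16*d^2 MACs.
narrow = [rng.standard_normal((d, d)) for _ in range(16)]

flops_a = d * (16 * d)   # MACs for model A
flops_b = 16 * d * d     # MACs for model B
assert flops_a == flops_b  # identical FLOP budget

def run_a():
    return x @ wide

def run_b():
    h = x
    for w in narrow:
        h = h @ w  # each small matmul is a separate, poorly parallelized step
    return h

# Time both: equal FLOPs, different wall-clock behavior.
for name, fn in [("wide", run_a), ("narrow", run_b)]:
    fn()  # warm-up
    t0 = time.perf_counter()
    for _ in range(100):
        fn()
    print(f"{name}: {time.perf_counter() - t0:.4f}s")
```

So a FLOP count is a rough proxy at best: latency is also driven by depth (sequential dependencies), memory traffic, and kernel launch overhead, which is why a FLOP-efficient network like DenseNet can still be slow at inference.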
Upvotes: 1